What Is Multithreading? 


Before beginning, it is necessary to define precisely what is meant by the term multithreading. 
Multithreading is a specialized form of multitasking. In general, there are two types of 
multitasking: process-based and thread-based. A process is, in essence, a program that is 
executing. Thus, process-based multitasking is the feature that allows your computer to run 
two or more programs concurrently. For example, it is process-based multitasking that allows 
you to run a word processor at the same time you are using a spreadsheet or browsing the 
Internet. In process-based multitasking, a program is the smallest unit of code that can be 
dispatched by the scheduler. 

A thread is a dispatchable unit of executable code. The name comes from the concept of a 
"thread of execution." In a thread-based multitasking environment, all processes have at least 
one thread, but they can have more. This means that a single program can perform two or 
more tasks concurrently. For instance, a text editor can be formatting text at the same time 
that it is printing, as long as these two actions are being performed by two separate threads. 
The differences between process-based and thread-based multitasking can be summarized like 
this: Process-based multitasking handles the concurrent execution of programs. Thread-based 
multitasking deals with the concurrent execution of pieces of the same program. 

In the preceding discussion, it is important to clarify that true concurrent execution is
possible only in a system with multiple CPUs (or CPU cores), in which each process or thread
has unrestricted access to a CPU. On a single-CPU system, only the appearance of simultaneous
execution is achieved: each process or thread receives a portion of the CPU's time, with the
amount of time determined by several factors, including the priority of the process or thread.
Even when truly concurrent execution is not available, you should assume that it is when
writing multithreaded programs. This is because you can't know the precise order in which
separate threads will be executed, or whether they will execute in the same sequence twice.
Thus, it's best to program as if true concurrent execution is the case.

Multithreading Changes the Architecture of a Program 

Multithreading changes the fundamental architecture of a program. Unlike a single-threaded 
program that executes in a strictly linear fashion, a multithreaded program executes portions 
of itself concurrently. Thus, all multithreaded programs include an element of parallelism. 
Consequently, a major issue in multithreaded programs is managing the interaction of the 
threads. 

As explained earlier, all processes have at least one thread of execution, which is called the 
main thread. The main thread is created when your program begins. In a multithreaded 
program, the main thread creates one or more child threads. Thus, each multithreaded 
process starts with one thread of execution and then creates one or more additional threads. 

In a properly designed program, each thread represents a single logical unit of activity. 

The principal advantage of multithreading is that it enables you to write very efficient 
programs because it lets you utilize the idle time that is present in most programs. Most I/O 
devices, whether they are network ports, disk drives, or the keyboard, are much slower than 
the CPU. Often, a program will spend a majority of its execution time waiting to send or 
receive data. With the careful use of multithreading, your program can execute another task 
during this idle time. For example, while one part of your program is sending a file over the 
Internet, another part can be reading keyboard input, and still another can be buffering the 
next block of data to send. 


Why Doesn't C++ Contain Built-In Support for Multithreading? 



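The version of C++ used in this chapter, which predates the thread library added by the C++11
standard, does not contain any built-in support for multithreaded applications. Instead, it relies
entirely upon the operating system to provide this feature. Given that both Java and C#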
provide built-in support for multithreading, it is natural to ask why this isn't also the case for
C++. The answers are efficiency, control, and the range of applications to which C++ is
applied. Let's examine each.

By not building in support for multithreading, C++ does not attempt to define a "one size fits 
all" solution. Instead, C++ allows you to directly utilize the multithreading features provided 
by the operating system. This approach means that your programs can be multithreaded by
the most efficient means supported by the execution environment. Because many multitasking
environments offer rich support for multithreading, being able to access that support is crucial 
to the creation of high-performance, multithreaded programs. 

Using operating system functions to support multithreading gives you access to the full range 
of control offered by the execution environment. Consider Windows. It defines a rich set of 
thread-related functions that enable finely grained control over the creation and management 
of a thread. For example, Windows has several ways to control access to a shared resource, 
including semaphores, mutexes, event objects, waitable timers, and critical sections. This level 
of flexibility cannot be easily designed into a language because the capabilities of operating 
systems differ. Thus, language-level support for multithreading usually means offering only a 
"lowest common denominator" of features. With C++, you gain access to all the features that 
the operating system provides. This is a major advantage when writing high-performance 
code. 

C++ was designed for all types of programming, from embedded systems in which there is no 
operating system in the execution environment to highly distributed, GUI-based end-user 
applications and everything in between. Therefore, C++ cannot place significant constraints on 
its execution environment. Building in support for multithreading would have inherently limited 
C++ to only those environments that supported it and thus prevented C++ from being used 
to create software for nonthreaded environments. 

In the final analysis, not building in support for multithreading is a major advantage for C++
because it enables programs to be written in the most efficient way possible for the target 
execution environment. Remember, C++ is all about power. In the case of multithreading, it is 
definitely a situation in which "less is more." 

What Operating System and Compiler? 

Because C++ relies on the operating system to provide support for multithreaded 
programming, it is necessary to choose an operating system as the target for the 
multithreaded applications in this chapter. Because Windows is the most widely used operating 
system in the world, it is the operating system used in this chapter. However, much of the 
information can be generalized to any OS that supports multithreading. 

Because Visual C++ is arguably the most widely used compiler for producing Windows 
programs, it is the compiler required by the examples in this chapter. The importance of this is 
made apparent in the following section. However, if you are using another compiler, the code 
can be easily adapted to accommodate it. 

Windows offers a wide array of Application Programming Interface (API) functions that support 
multithreading. Many readers will be at least somewhat familiar with the multithreading 
functions offered by Windows, but for those who are not, an overview of those used in this 
chapter is presented here. Keep in mind that Windows provides many other multithreading- 
based functions that you might want to explore on your own. 


To use Windows' multithreading functions, you must include <windows.h> in your program. 



Creating and Terminating a Thread 


To create a thread, the Windows API supplies the CreateThread( ) function. Its prototype is 
shown here: 


HANDLE CreateThread(LPSECURITY_ATTRIBUTES secAttr,
                    SIZE_T stackSize,
                    LPTHREAD_START_ROUTINE threadFunc,
                    LPVOID param,
                    DWORD flags,
                    LPDWORD threadID);


Here, secAttr is a pointer to a set of security attributes pertaining to the thread. However, if 
secAttr is NULL, then the default security descriptor is used. 

Each thread has its own stack. You can specify the size of the new thread's stack in bytes
using the stackSize parameter. If this integer value is zero, then the thread will be given the
default stack size for the executable, which is also the size used for the main thread. In this
case, the stack will be expanded, if necessary. (Specifying zero is the common approach taken
to thread stack size.)

Each thread of execution begins with a call to a function, called the thread function, within the 
creating process. Execution of the thread continues until the thread function returns. The 
address of this function (that is, the entry point to the thread) is specified in threadFunc. All 
thread functions must have this prototype: 

DWORD WINAPI threadfunc( LPVOID param); 

Any argument that you need to pass to the new thread is specified in CreateThread( )'s
param. This pointer-sized value (32 bits in a 32-bit program, 64 bits in a 64-bit program) is
received by the thread function in its parameter. This parameter may be used for any purpose.
The function returns its exit status.

The flags parameter determines the execution state of the thread. If it is zero, the thread
begins execution immediately. If it is CREATE_SUSPENDED, the thread is created in a suspended
state, awaiting execution. (It may be started using a call to ResumeThread( ), discussed
later.)

The identifier associated with a thread is returned in the long integer pointed to by threadID. 

The function returns a handle to the thread if successful or NULL if a failure occurs. The thread 
handle can be explicitly destroyed by calling CloseHandle( ). Otherwise, it will be destroyed 
automatically when the parent process ends. 

As just explained, a thread of execution terminates when its entry function returns. The 
process may also terminate the thread manually, using either TerminateThread( ) or 
ExitThread( ), whose prototypes are shown here: 

BOOL TerminateThread(HANDLE thread, DWORD status ); 

VOID ExitThread( DWORD status); 

For TerminateThread( ), thread is the handle of the thread to be terminated. ExitThread( ) can 
only be used to terminate the thread that calls ExitThread( ). For both functions, status is the 
termination status. TerminateThread( ) returns nonzero if successful and zero otherwise. 


Calling ExitThread( ) is functionally equivalent to allowing a thread function to return normally. 
This means that the stack is properly reset. When a thread is terminated using 
TerminateThread( ), it is stopped immediately and does not perform any special cleanup 
activities. Also, TerminateThread( ) may stop a thread during an important operation. For 
these reasons, it is usually best (and easiest) to let a thread terminate normally when its entry 
function returns. 
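To make the preferred approach concrete, the following sketch shows a thread function that ends by returning rather than by being terminated from outside. It is only an illustration: WorkerState, stopRequested, and worker are hypothetical names, std::atomic assumes a C++11 or later compiler, and the function omits the WINAPI calling convention so that it compiles on any platform. In a Windows program, a function of this shape (with the calling convention added) is what would be passed as threadFunc, and the address of a WorkerState object is what would be passed as param.

```cpp
#include <atomic>

// Hypothetical state shared between the creating thread and the worker.
// Its address is what would travel through the void* thread parameter.
struct WorkerState {
    std::atomic<bool> stopRequested{false}; // set by another thread to ask for a stop
    unsigned maxSteps = 0;                  // amount of work to do if not stopped
    unsigned stepsDone = 0;                 // progress; read it only after the worker ends
};

// Thread function in the general shape described above: it receives one
// pointer-sized argument and returns its exit status.
unsigned worker(void *param) {
    WorkerState *state = static_cast<WorkerState *>(param);

    while (state->stepsDone < state->maxSteps) {
        // Check the request between units of work, never in the middle of one.
        if (state->stopRequested.load()) {
            return 1; // exit status: stopped early, at a safe point
        }
        ++state->stepsDone; // stands in for one unit of real work
    }
    return 0; // exit status: all work completed
}
```

Because the worker decides for itself where to stop, it never leaves an operation half finished, which is the risk that TerminateThread( ) carries.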

The Visual C++ Alternatives to CreateThread( ) and ExitThread( ) 

Although CreateThread( ) and ExitThread( ) are the Windows API functions used to create and 
terminate a thread, we won't be using them in this chapter! The reason is that when these 
functions are used with Visual C++ (and possibly other Windows-compatible compilers), they 
can result in memory leaks, the loss of a small amount of memory. For Visual C++, if a 
multithreaded program utilizes C/C++ standard library functions and uses CreateThread( ) 
and ExitThread( ), then small amounts of memory are lost. (If your program does not use the 
C/C++ standard library, then no such losses will occur.) To eliminate this problem, you must 
use functions defined by the C/C++ runtime library to start and stop threads rather than those 
specified by the Win32 API. These functions parallel CreateThread( ) and ExitThread( ), but do 
not generate a memory leak. 


The Visual C++ alternatives to CreateThread( ) and ExitThread( ) are _beginthreadex( ) and 
_endthreadex( ). Both require the header file <process.h>. Here is the prototype for 
_beginthreadex( ): 

uintptr_t _beginthreadex(void *secAttr, unsigned stackSize,
                         unsigned (__stdcall *threadFunc)(void *),
                         void *param, unsigned flags,
                         unsigned *threadID);

As you can see, the parameters to _beginthreadex( ) parallel those to CreateThread( ). 
Furthermore, they have the same meaning as those specified by CreateThread( ). secAttr is a 
pointer to a set of security attributes pertaining to the thread. However, if secAttr is NULL, 
then the default security descriptor is used. The size of the new thread's stack, in bytes, is
passed in the stackSize parameter. If this value is zero, then the thread will be given a stack
that is the same size as the main thread of the process that creates it.

The address of the thread function (that is, the entry point to the thread) is specified in 
threadFunc. For _beginthreadex( ), a thread function must have this prototype: 

unsigned __stdcall threadfunc(void *param);

This prototype is functionally equivalent to the one for CreateThread( ), but it uses different 
type names. Any argument that you need to pass to the new thread is specified in the param 
parameter. 

The flags parameter determines the execution state of the thread. If it is zero, the thread
begins execution immediately. If it is CREATE_SUSPENDED, the thread is created in a suspended
state, awaiting execution. (It may be started using a call to ResumeThread( ).) The identifier
associated with a thread is returned in the unsigned integer pointed to by threadID.


The function returns a handle to the thread if successful or zero if a failure occurs. The type 
uintptr_t specifies a Visual C++ type capable of holding a pointer or handle. 



The prototype for _endthreadex( ) is shown here:

void _endthreadex(unsigned status);

It functions just like ExitThread( ) by stopping the thread and returning the exit code specified 
in status. 

Because the most widely used compiler for Windows is Visual C++, the examples in this
chapter will use _beginthreadex( ) and _endthreadex( ) rather than their equivalent API
functions. If you are using a compiler other than Visual C++, simply substitute CreateThread( )
and ExitThread( ).

When using _beginthreadex( ) and _endthreadex( ), you must remember to link in the 
multithreaded library. This will vary from compiler to compiler. Here are some examples. 

When using the Visual C++ command-line compiler, include the -MT option. To use the
multithreaded library from the Visual C++ 6 IDE, first activate the Project | Settings property
sheet. Then, select the C/C++ tab. Next, select Code Generation from the Category list box
and then choose Multithreaded in the Use Runtime Library list box. For the Visual C++ 7 .NET
IDE, select Project | Properties. Next, select the C/C++ entry and highlight Code Generation.
Finally, choose Multi-threaded as the runtime library.

Suspending and Resuming a Thread 

A thread of execution can be suspended by calling SuspendThread( ). It can be resumed by 
calling ResumeThread( ). The prototypes for these functions are shown here: 

DWORD SuspendThread(HANDLE hThread);

DWORD ResumeThread(HANDLE hThread);

For both functions, the handle to the thread is passed in hThread. 

Each thread of execution has associated with it a suspend count. If this count is zero, then the 
thread is not suspended. If it is nonzero, the thread is in a suspended state. Each call to 
SuspendThread( ) increments the suspend count. Each call to ResumeThread( ) decrements 
the suspend count. A suspended thread will resume only when its suspend count has reached
zero. Therefore, resuming a suspended thread requires as many calls to ResumeThread( ) as
there have been calls to SuspendThread( ).

Both functions return the thread's previous suspend count or -1 if an error occurs. 
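The suspend-count rule can be modeled in a few lines. The class below is a simulation only: SuspendCountModel and its members are hypothetical names, no Windows function is called, and the model assumes that resuming a thread whose count is already zero leaves the count at zero. Like the real functions, each member returns the previous suspend count.

```cpp
// A model of a thread's suspend count (not the Windows API itself).
class SuspendCountModel {
    unsigned count = 0; // zero means the thread is free to run
public:
    // Mirrors SuspendThread( ): increments the count and
    // returns the previous value.
    unsigned suspend() { return count++; }

    // Mirrors ResumeThread( ): decrements the count if it is
    // nonzero and returns the previous value.
    unsigned resume() {
        unsigned previous = count;
        if (count > 0) --count;
        return previous;
    }

    // True while at least one suspend has not been matched by a resume.
    bool isSuspended() const { return count != 0; }
};
```

In this model, two calls to suspend( ) followed by one call to resume( ) leave isSuspended( ) true. A second resume( ) is needed before the modeled thread can run again.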

Changing the Priority of a Thread 

In Windows, each thread has associated with it a priority setting. A thread's priority
determines how much CPU time a thread receives. Low priority threads receive little time.
High priority threads receive a lot. Of course, how much CPU time a thread receives has a
profound impact on its execution characteristics and its interaction with other threads
currently executing in the system.

In Windows, a thread's priority setting is the combination of two values: the overall priority 
class of the process and the priority setting of the individual thread relative to that priority 
class. That is, a thread's actual priority is determined by combining the process's priority class 
with the thread's individual priority level. Each is examined next. 


Priority Classes

By default, a process is given a priority class of normal, and most programs remain in the
normal priority class throughout their execution lifetime. Although neither of the examples in
this chapter changes the priority class, a brief overview of the priority classes is given
here in the interest of completeness.

Windows defines six priority classes, which correspond to the values shown here, in order of
highest to lowest priority:


REALTIME_PRIORITY_CLASS 


HIGH_PRIORITY_CLASS 


ABOVE_NORMAL_PRIORITY_CLASS 


NORMAL_PRIORITY_CLASS 


BELOW_NORMAL_PRIORITY_CLASS 


IDLE_PRIORITY_CLASS 


Programs are given the NORMAL_PRIORITY_CLASS by default. Usually, you won't need to 
alter the priority class of your program. In fact, changing a process' priority class can have 
negative consequences on the overall performance of the computer system. For example, if 
you increase a program's priority class to REALTIME_PRIORITY_CLASS, it will dominate the 
CPU. For some specialized applications, you may need to increase an application's priority 
class, but usually you won't. As mentioned, neither of the applications in this chapter changes 
the priority class. 

In the event that you do want to change the priority class of a program, you can do so by calling
SetPriorityClass( ). You can obtain the current priority class by calling GetPriorityClass( ). The 
prototypes for these functions are shown here: 

DWORD GetPriorityClass(HANDLE hApp);

BOOL SetPriorityClass(HANDLE hApp, DWORD priority);

Here, hApp is the handle of the process. GetPriorityClass( ) returns the priority class of the 
application or zero on failure. For SetPriorityClass( ), priority specifies the process's new 
priority class. 

Thread Priorities 

For any given priority class, each individual thread's priority determines how much CPU time it 
receives within its process. When a thread is first created, it is given normal priority, but you 
can change a thread's priority, even while it is executing.

You can obtain a thread's priority setting by calling GetThreadPriority( ). You can increase or 
decrease a thread's priority using SetThreadPriority( ). The prototypes for these functions are 
shown here: 

BOOL SetThreadPriority(HANDLE hThread, int priority);
int GetThreadPriority(HANDLE hThread);

For both functions, hThread is the handle of the thread. For SetThreadPriority( ), priority is the
new priority setting. If an error occurs, SetThreadPriority( ) returns zero. It returns nonzero
otherwise. For GetThreadPriority( ), the current priority setting is returned. The priority
settings are shown here, in order of highest to lowest:


Thread Priority                   Value
THREAD_PRIORITY_TIME_CRITICAL     15
THREAD_PRIORITY_HIGHEST           2
THREAD_PRIORITY_ABOVE_NORMAL      1
THREAD_PRIORITY_NORMAL            0
THREAD_PRIORITY_BELOW_NORMAL      -1
THREAD_PRIORITY_LOWEST            -2
THREAD_PRIORITY_IDLE              -15


These values are increments or decrements that are applied relative to the priority class of the 
process. Through the combination of a process' priority class and thread priority, Windows 
supports 31 different priority settings for application programs. 
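As an illustration of how the two settings combine, the function below computes the priority level that results from a given priority class and thread priority. It is a sketch, not part of the Windows API: PriorityClass, kTimeCritical, kIdle, and basePriority are hypothetical names. The class base values used here (4, 6, 8, 10, 13, and 24) are not given in this chapter. They follow the scheduling-priority table in Microsoft's Windows documentation and should be checked against it. The time-critical and idle settings are treated as special cases that map to fixed levels rather than offsets.

```cpp
// Process priority classes, named after the Windows constants (hypothetical enum).
enum class PriorityClass { Idle, BelowNormal, Normal, AboveNormal, High, Realtime };

// The two extreme thread priority values from the table of thread priorities.
const int kTimeCritical = 15;
const int kIdle = -15;

// Returns the resulting priority level (1 through 31) for a combination of
// process priority class and thread priority value.
int basePriority(PriorityClass cls, int threadPriority) {
    const bool realtime = (cls == PriorityClass::Realtime);

    // The extreme settings map to fixed levels instead of offsets.
    if (threadPriority == kTimeCritical) return realtime ? 31 : 15;
    if (threadPriority == kIdle) return realtime ? 16 : 1;

    // Level given to a THREAD_PRIORITY_NORMAL thread in each class.
    int classBase = 0;
    switch (cls) {
    case PriorityClass::Idle:        classBase = 4;  break;
    case PriorityClass::BelowNormal: classBase = 6;  break;
    case PriorityClass::Normal:      classBase = 8;  break;
    case PriorityClass::AboveNormal: classBase = 10; break;
    case PriorityClass::High:        classBase = 13; break;
    case PriorityClass::Realtime:    classBase = 24; break;
    }

    // The remaining thread priorities (-2 through 2) are offsets from the base.
    return classBase + threadPriority;
}
```

Under these assumptions, a THREAD_PRIORITY_HIGHEST thread in a NORMAL_PRIORITY_CLASS process runs at level 10, while the same thread priority in a HIGH_PRIORITY_CLASS process gives level 15.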

GetThreadPriority( ) returns THREAD_PRIORITY_ERROR_RETURN if an error occurs. 

For the most part, if a thread's process has the NORMAL_PRIORITY_CLASS priority class, you can
freely experiment with changing the thread's priority setting without fear of catastrophically
affecting overall system performance. As you will see, the thread control panel developed in
the next section allows you to alter the priority setting of a thread within a process (but does
not change its priority class).

Obtaining the Handle of the Main Thread 

It is possible to control the execution of the main thread. To do so, you will need to acquire its 
handle. The easiest way to do this is to call GetCurrentThread( ), whose prototype is shown 
here: 

HANDLE GetCurrentThread(void); 

This function returns a pseudohandle to the current thread. It is called a pseudohandle
because it is a predefined value that always refers to whichever thread is using it, rather
than a handle that identifies one specific thread. Within the calling thread, however, it can
be used any place that a normal thread handle can.

Synchronization 

When using multiple threads or processes, it is sometimes necessary to coordinate the
activities of two or more. This process is called synchronization. The most common use of
synchronization occurs when two or more threads need access to a shared resource that must
be used by only one thread at a time. For example, when one thread is writing to a file, a
second thread must be prevented from doing so at the same time. Another reason for
synchronization is when one thread is waiting for an event that is caused by another thread.
In this case, there must be some means by which the first thread is held in a suspended state
until the event has occurred. Then the waiting thread must resume execution.


There are two general states that a task may be in. First, it may be executing (or ready to 
execute as soon as it obtains its time slice). Second, a task may be blocked, awaiting some 
resource or event, in which case its execution is suspended until the needed resource is 
available or the event occurs. 

If you are not familiar with the synchronization problem or its most common solution, the 
semaphore, the next section discusses it. 

Understanding the Synchronization Problem 

Windows must provide special services that allow access to a shared resource to be 
synchronized, because without help from the operating system, there is no way for one 
process or thread to know that it has sole access to a resource. To understand this, imagine 
that you are writing programs for a multitasking operating system that does not provide any 
synchronization support. Further imagine that you have two concurrently executing threads, A 
and B, both of which, from time to time, require access to some resource R (such as a disk 
file) that must be accessed by only one thread at a time. As a means of preventing one thread 
from accessing R while the other is using it, you try the following solution. First, you establish 
a variable called flag that is initialized to zero and can be accessed by both threads. Then, 
before using each piece of code that accesses R, you wait for flag to be cleared, then set flag, 
access R, and finally, clear flag. That is, before either thread accesses R, it executes this piece 
of code: 


while (flag) ;   // wait for flag to be cleared
flag = 1;        // set flag
// ... access resource R ...
flag = 0;        // clear the flag


The idea behind this code is that neither thread will access R if flag is set. Conceptually, this 
approach is in the spirit of the correct solution. However, in actual fact it leaves much to be 
desired for one simple reason: it won't always work! Let's see why. 

Using the code just given, it is possible for both threads to access R at the same time. The
while loop is, in essence, performing repeated load and compare instructions on flag or, in 
other words, it is testing flag's value. When flag is cleared, the next line of code sets flag's 
value. The trouble is that it is possible for these two operations to be performed in two 
different time slices. Between the two time slices, the value of flag might have been accessed 
by the other thread, thus allowing R to be used by both threads at the same time. To 
understand this, imagine that thread A enters the while loop and finds that flag is zero, which 
is the green light to access R. However, before it can set flag to 1, its time slice expires and 
thread B resumes execution. If B executes its while, it too will find that flag is not set and 
assume that it is safe to access R. However, when A resumes it will also begin accessing R. 
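The interleaving just described can be replayed deterministically. The sketch below does not create real threads. It simulates threads A and B as small state machines so that the test of flag and the setting of flag are separate steps that a "time slice" can fall between. All names (Shared, SimThread, threadsUsingR) are hypothetical and exist only for this illustration.

```cpp
#include <string>

// State shared by the two simulated threads.
struct Shared {
    int flag = 0;    // 0 = R is free, 1 = R is in use
    int insideR = 0; // number of threads currently using R
};

// One simulated thread. Each call to step( ) performs a single operation,
// so a time slice can end between the test and the set.
struct SimThread {
    enum State { Testing, Setting, UsingR };
    State state = Testing;

    void step(Shared &s) {
        switch (state) {
        case Testing: // while (flag) ;
            if (s.flag == 0) state = Setting;
            break;
        case Setting: // flag = 1; then begin using R
            s.flag = 1;
            ++s.insideR;
            state = UsingR;
            break;
        case UsingR:  // still using R; nothing changes
            break;
        }
    }
};

// Runs the steps named in schedule ('A' or 'B', one operation each) and
// returns how many threads end up using R at the same time.
int threadsUsingR(const std::string &schedule) {
    Shared shared;
    SimThread a, b;
    for (char who : schedule) {
        if (who == 'A') a.step(shared);
        else if (who == 'B') b.step(shared);
    }
    return shared.insideR;
}
```

With the schedule "AABB", thread A tests and sets flag before B runs, so only one thread uses R. With "ABAB", both threads test flag before either sets it, and the function reports two threads using R at once.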

The crucial aspect of the problem is that the testing and setting of flag do not comprise one 
uninterruptible operation. Rather, as just illustrated, they can be separated by a time slice. No 
matter how you try, there is no way, using only application-level code, that you can absolutely 
guarantee that one and only one thread will access R at one time. 

The solution to the synchronization problem is as elegant as it is simple. The operating system 
(in this case Windows) provides a routine that in one uninterrupted operation, tests and, if 
possible, sets a flag. In the language of operating systems engineers, this is called a test and 
set operation. For historical reasons, the flags used to control access to a shared resource and 
provide synchronization between threads (and processes) are called semaphores. The 
semaphore is at the core of the Windows synchronization system. 
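For comparison, compilers that support C++11 or later expose an indivisible test-and-set operation through std::atomic_flag, which relies on an atomic hardware instruction rather than on ordinary loads and stores. The sketch below illustrates the idea only and is not the Windows mechanism used in this chapter. TestAndSetLock is a hypothetical name. Its tryAcquire( ) function tests and sets the flag in one uninterruptible step, which is what the while loop and assignment shown earlier could not do.

```cpp
#include <atomic>

// A lock built on an atomic test-and-set (illustrative sketch).
class TestAndSetLock {
    std::atomic_flag flag = ATOMIC_FLAG_INIT; // clear means the resource is free
public:
    // Tests and sets the flag as one indivisible operation.
    // Returns true if the caller found the flag clear and now owns the lock.
    bool tryAcquire() {
        return !flag.test_and_set(std::memory_order_acquire);
    }

    // Spins until the lock is obtained.
    void acquire() {
        while (!tryAcquire()) {
            // busy-wait; another thread currently owns the lock
        }
    }

    // Clears the flag so that another thread can acquire the lock.
    void release() {
        flag.clear(std::memory_order_release);
    }
};
```

A thread would call acquire( ) before using R and release( ) afterward. Unlike the synchronization objects described next, this lock spins instead of blocking, so it wastes CPU time whenever the wait is long.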



Windows supports several types of synchronization objects. The first type is the classic 
semaphore. When using a semaphore, a resource can be completely synchronized, in which 
case one and only one thread or process can access it at any one time, or the semaphore can 
allow no more than a small number of processes or threads access at any one time. 
Semaphores are implemented using a counter that is decremented when a task is granted the 
semaphore and incremented when the task releases it. 
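The counter behavior just described can be sketched as follows. This is a single-process model for illustration, not the Windows semaphore object: CountingSemaphoreModel is a hypothetical name, std::mutex assumes a C++11 or later compiler, and a task that cannot be granted the semaphore is simply refused rather than blocked.

```cpp
#include <mutex>

// A model of a counting semaphore (not the Windows semaphore object).
class CountingSemaphoreModel {
    std::mutex guard;  // makes each test-and-update of the counter indivisible
    int available;     // number of grants still available
    const int maximum; // largest number of simultaneous holders
public:
    explicit CountingSemaphoreModel(int maxUsers)
        : available(maxUsers), maximum(maxUsers) {}

    // Grants the semaphore if the counter is above zero, decrementing it.
    // Returns false if no grant is available.
    bool tryAcquire() {
        std::lock_guard<std::mutex> lock(guard);
        if (available == 0) return false;
        --available;
        return true;
    }

    // Gives back one grant, incrementing the counter. Returns false if
    // nothing was outstanding, so the counter never exceeds the maximum.
    bool release() {
        std::lock_guard<std::mutex> lock(guard);
        if (available == maximum) return false;
        ++available;
        return true;
    }
};
```

Constructing the model with a maximum of 1 gives the fully synchronized case, in which only one task can hold the semaphore at any one time.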

The second synchronization object is the mutex semaphore, or just mutex, for short. A mutex 
synchronizes a resource such that one and only one thread or process can access it at any one 
time. In essence, a mutex is a special case version of a standard semaphore. 

The third synchronization object is the event object. It can be used to block access to a 
resource until some other thread or process signals that it can be used. (That is, an event 
object signals that a specified event has occurred.) 

The fourth synchronization object is the waitable timer. A waitable timer blocks a thread's 
execution until a specific time. You can also create timer queues, which are lists of timers. 

You can prevent a section of code from being used by more than one thread at a time by 
making it into a critical section using a critical section object. Once a critical section is entered 
by one thread, no other thread may use it until the first thread has left the critical section. 

The only synchronization object used in this chapter is the mutex, which is described in the 
following section. However, all synchronization objects defined by Windows are available to 
the C++ programmer. As explained, this is one of the major advantages that results from 
C++'s reliance on the operating system to handle multithreading: all multithreading features 
are at your command. 

Using a Mutex to Synchronize Threads 

As explained, a mutex is a special-case semaphore that allows only one thread to access a 
resource at any given time. Before you can use a mutex, you must create one using 
CreateMutex( ), whose prototype is shown here: 

HANDLE CreateMutex(LPSECURITY_ATTRIBUTES secAttr,
                   BOOL acquire,
                   LPCSTR name);

Here, secAttr is a pointer to the security attributes. If secAttr is NULL, the default security 
descriptor is used. 

If the creating thread desires control of the mutex, then acquire must be true. Otherwise, pass 
false. 

The name parameter points to a string that becomes the name of the mutex object. Mutexes
are global objects, which may be used by other processes. As such, when two processes each
open a mutex using the same name, both are referring to the same mutex. In this way, two
processes can be synchronized. The name may also be NULL, in which case the mutex is
localized to one process.

The CreateMutex( ) function returns a handle to the mutex if successful or NULL on
failure. A mutex handle is automatically closed when the main process ends. You can explicitly
close a mutex handle when it is no longer needed by calling CloseHandle( ).



Once you have created a mutex, you use it by calling two related functions:
WaitForSingleObject( ) and ReleaseMutex( ). The prototypes for these functions are shown
here:

DWORD WaitForSingleObject(HANDLE hObject, DWORD howLong);

BOOL ReleaseMutex(HANDLE hMutex);

WaitForSingleObject( ) waits on a synchronization object. It does not return until the object 
becomes available or a time-out occurs. For use with mutexes, hObject will be the handle of a 
mutex. The howLong parameter specifies, in milliseconds, how long the calling routine will 
wait. Once that time has elapsed, a time-out error will be returned. To wait indefinitely, use 
the value INFINITE. The function returns WAIT_OBJECT_0 when successful (that is, when 
access is granted). It returns WAIT_TIMEOUT when time-out is reached. 

ReleaseMutex( ) releases the mutex and allows another thread to acquire it. Here, hMutex is 
the handle to the mutex. The function returns nonzero if successful and zero on failure. 

To use a mutex to control access to a shared resource, wrap the code that accesses that 
resource between a call to WaitForSingleObject( ) and ReleaseMutex( ), as shown in this 
skeleton. (Of course, the time-out period will differ from application to application.) 

if (WaitForSingleObject(hMutex, 10000) == WAIT_TIMEOUT) {
  // handle time-out error; the mutex was not acquired
}
else {
  // access the resource
  ReleaseMutex(hMutex);
}

Generally, you will want to choose a time-out period that will be more than enough to 
accommodate the actions of your program. If you get repeated time-out errors when 
developing a multithreaded application, it usually means that you have created a deadlock 
condition. Deadlock occurs when one thread is waiting on a mutex that another thread never 
releases. 
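One common way to reduce the risk of a mutex that is never released is to tie the release to the end of a scope, so that it happens on every exit path, including early returns and exceptions. The following sketch shows that idea in portable form and is not code from this chapter: ScopedRelease and useGuardedResource are hypothetical names, and std::function assumes a C++11 or later compiler. In a Windows program, the callable passed to the guard would call ReleaseMutex( ) on a mutex that WaitForSingleObject( ) has already granted.

```cpp
#include <functional>
#include <utility>

// Runs a release action once, when the guard goes out of scope.
class ScopedRelease {
    std::function<void()> releaseAction;
public:
    explicit ScopedRelease(std::function<void()> action)
        : releaseAction(std::move(action)) {}

    ~ScopedRelease() {
        if (releaseAction) releaseAction();
    }

    // Copying would cause the action to run twice, so it is disallowed.
    ScopedRelease(const ScopedRelease &) = delete;
    ScopedRelease &operator=(const ScopedRelease &) = delete;
};

// Hypothetical helper showing the pattern: the release runs even when
// the function leaves the guarded region early.
int useGuardedResource(bool leaveEarly, int &releaseCount) {
    // Here the "release" just counts calls; real code would release a mutex.
    ScopedRelease guard([&releaseCount] { ++releaseCount; });

    if (leaveEarly) {
        return -1; // early exit; the guard still releases
    }
    // ... access the resource ...
    return 0;
}
```

Both the normal path and the early-exit path increment releaseCount exactly once, which is the property that prevents other threads from waiting on a mutex that is never released.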

Creating a Thread Control Panel 

When developing multithreaded programs, it is often useful to experiment with various priority 
settings. It is also useful to be able to dynamically suspend and resume a thread, or even 
terminate a thread. As you will see, it is quite easy, using the thread functions just described, 
to create a thread control panel that allows you to accomplish these things. Further, you can 
use the control panel while your multithreaded program is running. The dynamic nature of the 
thread control panel allows you to easily change the execution profile of a thread and observe 
the results. 

The thread control panel developed in this section is capable of controlling one thread. 
However, you can create as many panels as needed, with each controlling a different thread. 
For the sake of simplicity, the control panel is implemented as a modeless dialog box that is 
owned by the desktop, not the application whose thread it controls. 

The thread control panel is capable of performing the following actions: 

• Setting a thread's priority 

• Suspending a thread 

• Resuming a thread 

• Terminating a thread 



As stated, the control panel is implemented as a modeless dialog box. As you know, when a modeless dialog
box is activated, the rest of the application is still active. Thus, the control panel runs 
independently of the application for which it is being used. 

The code for the thread control panel is shown here. This file is called tcp.cpp. 


// A thread control panel.
#include <map>
#include <windows.h>
#include "panel.h"
using namespace std;

const int NUMPRIORITIES = 5;
const int OFFSET = 2;

// Array of strings for priority list box.
char priorities[NUMPRIORITIES][80] = {
  "Lowest",
  "Below Normal",
  "Normal",
  "Above Normal",
  "Highest"
};

// A thread control panel class.
class ThrdCtrlPanel {
  // Information about the thread under control.
  struct ThreadInfo {
    HANDLE hThread; // handle of thread
    int priority;   // current priority
    bool suspended; // true if suspended
    ThreadInfo(HANDLE ht, int p, bool s) {
      hThread = ht;
      priority = p;
      suspended = s;
    }
  };

  // This map holds a ThreadInfo for each
  // active thread control panel.
  static map<HWND, ThreadInfo> dialogmap;

public:
  // Construct a control panel.
  ThrdCtrlPanel(HINSTANCE hInst, HANDLE hThrd);

  // The control panel's callback function.
  static LRESULT CALLBACK ThreadPanel(HWND hwnd, UINT message,
                                      WPARAM wParam, LPARAM lParam);
};

// Define static member dialogmap.
map<HWND, ThrdCtrlPanel::ThreadInfo> ThrdCtrlPanel::dialogmap;

// Create a thread control panel.
ThrdCtrlPanel::ThrdCtrlPanel(HINSTANCE hInst,
                             HANDLE hThrd)
{
  ThreadInfo ti(hThrd,
                GetThreadPriority(hThrd)+OFFSET,
                false);

  // Owner window is desktop.
  HWND hDialog = CreateDialog(hInst, "ThreadPanelDB",
                              NULL,
                              (DLGPROC) ThreadPanel);

  // Put info about this dialog box in the map.
  dialogmap.insert(pair<HWND, ThreadInfo>(hDialog, ti));

  // Set the control panel's title.
  char str[80] = "Control Panel for Thread #";
  char str2[4];
  _itoa(dialogmap.size(), str2, 10);
  strcat(str, str2);
  SetWindowText(hDialog, str);

  // Offset each dialog box instance.
  MoveWindow(hDialog, 30*dialogmap.size(),
             30*dialogmap.size(),
             300, 250, 1);

  // Update priority setting in the list box.
  SendDlgItemMessage(hDialog, IDD_LB, LB_SETCURSEL,
                     (WPARAM) ti.priority, 0);

  // Increase priority to ensure control. You can
  // change or remove this statement based on your
  // execution environment.
  SetThreadPriority(GetCurrentThread(),
                    THREAD_PRIORITY_ABOVE_NORMAL);
}

// Thread control panel dialog box callback function.
LRESULT CALLBACK ThrdCtrlPanel::ThreadPanel(HWND hwnd,
                                            UINT message,
                                            WPARAM wParam,
                                            LPARAM lParam)
{
  int i;
  HWND hpbRes, hpbSus, hpbTerm;

  switch(message) {
    case WM_INITDIALOG:
      // Initialize priority list box.
      for(i=0; i<NUMPRIORITIES; i++) {
        SendDlgItemMessage(hwnd, IDD_LB,
            LB_ADDSTRING, 0, (LPARAM) priorities[i]);
      }

      // Set Suspend and Resume buttons for thread.
      hpbSus = GetDlgItem(hwnd, IDD_SUSPEND);
      hpbRes = GetDlgItem(hwnd, IDD_RESUME);
      EnableWindow(hpbSus, true);  // enable Suspend
      EnableWindow(hpbRes, false); // disable Resume
      return 1;
    case WM_COMMAND:
      map<HWND, ThreadInfo>::iterator p = dialogmap.find(hwnd);

      switch(LOWORD(wParam)) {
        case IDD_TERMINATE:
          TerminateThread(p->second.hThread, 0);

          // Disable Terminate button.
          hpbTerm = GetDlgItem(hwnd, IDD_TERMINATE);
          EnableWindow(hpbTerm, false); // disable

          // Disable Suspend and Resume buttons.
          hpbSus = GetDlgItem(hwnd, IDD_SUSPEND);
          hpbRes = GetDlgItem(hwnd, IDD_RESUME);
          EnableWindow(hpbSus, false); // disable Suspend
          EnableWindow(hpbRes, false); // disable Resume
          return 1;
        case IDD_SUSPEND:
          SuspendThread(p->second.hThread);

          // Set state of the Suspend and Resume buttons.
          hpbSus = GetDlgItem(hwnd, IDD_SUSPEND);
          hpbRes = GetDlgItem(hwnd, IDD_RESUME);
          EnableWindow(hpbSus, false); // disable Suspend
          EnableWindow(hpbRes, true);  // enable Resume
          p->second.suspended = true;
          return 1;
        case IDD_RESUME:
          ResumeThread(p->second.hThread);

          // Set state of the Suspend and Resume buttons.
          hpbSus = GetDlgItem(hwnd, IDD_SUSPEND);
          hpbRes = GetDlgItem(hwnd, IDD_RESUME);
          EnableWindow(hpbSus, true);  // enable Suspend
          EnableWindow(hpbRes, false); // disable Resume
          p->second.suspended = false;
          return 1;
        case IDD_LB:
          // If a list box entry was double-clicked,
          // then change the priority.
          if(HIWORD(wParam)==LBN_DBLCLK) {
            p->second.priority = SendDlgItemMessage(hwnd,
                                     IDD_LB, LB_GETCURSEL,
                                     0, 0);
            SetThreadPriority(p->second.hThread,
                              p->second.priority-OFFSET);
          }
          return 1;
        case IDCANCEL:
          // If thread is suspended when panel is closed,
          // then resume thread to prevent deadlock.
          if(p->second.suspended) {
            ResumeThread(p->second.hThread);
            p->second.suspended = false;
          }

          // Remove this thread from the list.
          dialogmap.erase(hwnd);

          // Close the panel.
          DestroyWindow(hwnd);
          return 1;
      }
  }

  return 0;
}


The control panel requires the following resource file, called tcp.rc: 

#include <windows.h>
#include "panel.h"

ThreadPanelDB DIALOGEX 20, 20, 140, 110
CAPTION "Thread Control Panel"
STYLE WS_BORDER | WS_VISIBLE | WS_POPUP | WS_CAPTION | WS_SYSMENU
{
  DEFPUSHBUTTON "Done", IDCANCEL, 55, 80, 33, 14
  PUSHBUTTON "Terminate", IDD_TERMINATE, 10, 20, 42, 12
  PUSHBUTTON "Suspend", IDD_SUSPEND, 10, 35, 42, 12
  PUSHBUTTON "Resume", IDD_RESUME, 10, 50, 42, 12
  LISTBOX IDD_LB, 65, 20, 63, 42, LBS_NOTIFY | WS_VISIBLE |
          WS_BORDER | WS_VSCROLL | WS_TABSTOP
  CTEXT "Thread Priority", IDD_TEXT1, 65, 8, 64, 10
  CTEXT "Change State", IDD_TEXT2, 0, 8, 64, 10
}

The control panel uses the following header file, called panel.h:

#define IDD_LB        200
#define IDD_TERMINATE 202
#define IDD_SUSPEND   204
#define IDD_RESUME    206
#define IDD_TEXT1     208
#define IDD_TEXT2     209


To use the thread control panel, follow these steps: 


1. Include tcp.cpp in your program. 

2. Include tcp.rc in your program's resource file. 

3. Create the thread or threads that you want to control. 

4. Instantiate a ThrdCtrlPanel object for each thread. 

Each ThrdCtrlPanel object links a thread with a dialog box that controls it. For large projects in 
which multiple files need access to ThrdCtrlPanel, you will need to use a header file called 
tcp.h that contains the declaration for ThrdCtrlPanel. Here is tcp.h: 

// A header file for the ThrdCtrlPanel class.
class ThrdCtrlPanel {
public:
  // Construct a control panel.
  ThrdCtrlPanel(HINSTANCE hInst, HANDLE hThrd);

  // The control panel's callback function.
  static LRESULT CALLBACK ThreadPanel(HWND hwnd, UINT message,
                                      WPARAM wParam, LPARAM lParam);
};


Let's take a closer look at the thread control panel. It begins with the following global
definitions:

const int NUMPRIORITIES = 5;
const int OFFSET = 2;

// Array of strings for priority list box.
char priorities[NUMPRIORITIES][80] = {
  "Lowest",
  "Below Normal",
  "Normal",
  "Above Normal",
  "Highest"
};


The priorities array holds strings that correspond to a thread's priority setting. It initializes the 
list box inside the control panel that displays the current thread priority. The number of 
priorities is specified by NUMPRIORITIES, which is 5 for Windows. Thus, NUMPRIORITIES 
defines the number of different priorities that a thread may have. (If you adapt the code for 
use with another operating system, a different value might be required.) Using the control 
panel, you can set a thread to one of the following priorities: 

THREAD_PRIORITY_HIGHEST
THREAD_PRIORITY_ABOVE_NORMAL
THREAD_PRIORITY_NORMAL
THREAD_PRIORITY_BELOW_NORMAL
THREAD_PRIORITY_LOWEST

The other two thread priority settings:

THREAD_PRIORITY_TIME_CRITICAL
THREAD_PRIORITY_IDLE


are not supported because, relative to the control panel, they are of little practical value. For 
example, if you want to create a time-critical application, you are better off making its priority 
class time-critical. 

OFFSET defines an offset that will be used to translate between list box indexes and thread 
priorities. You should recall that normal priority has the value zero. In this example, the 
highest priority is THREAD_PRIORITY_HIGHEST, which is 2. The lowest priority is
THREAD_PRIORITY_LOWEST, which is -2. Because list box indexes begin at zero, the offset is
used to convert between indexes and priority settings. 
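The conversion can be seen in isolation in the following sketch. It is plain C++ with no Windows calls, and the function names indexToPriority( ) and priorityToIndex( ) are hypothetical. They do not appear in tcp.cpp, which performs the same arithmetic inline.

```cpp
#include <cassert>

const int NUMPRIORITIES = 5;
const int OFFSET = 2;

// Convert a list box index (0 through 4) to a relative
// thread priority (-2 through 2).
int indexToPriority(int index) {
  assert(index >= 0 && index < NUMPRIORITIES);
  return index - OFFSET;
}

// Convert a relative thread priority (-2 through 2) to a
// list box index (0 through 4).
int priorityToIndex(int priority) {
  assert(priority >= -OFFSET && priority <= OFFSET);
  return priority + OFFSET;
}
```

With these definitions, index 0 ("Lowest") maps to -2, index 2 ("Normal") maps to 0, and index 4 ("Highest") maps to 2. Converting in one direction and then the other returns the original value.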

Next, the ThrdCtrlPanel class is declared. It begins as shown here: 

// A thread control panel class.
class ThrdCtrlPanel {
  // Information about the thread under control.
  struct ThreadInfo {
    HANDLE hThread; // handle of thread
    int priority;   // current priority
    bool suspended; // true if suspended
    ThreadInfo(HANDLE ht, int p, bool s) {
      hThread = ht;
      priority = p;
      suspended = s;
    }
  };

  // This map holds a ThreadInfo for each
  // active thread control panel.
  static map<HWND, ThreadInfo> dialogmap;

Information about the thread under control is contained within a structure of type ThreadInfo.
The handle of the thread is stored in hThread. Its priority is stored in priority. If the thread is 
suspended, then suspended will be true. Otherwise, suspended will be false. 

The static member dialogmap is an STL map that links the thread information with the handle 
of the dialog box used to control that thread. Because there can be more than one thread 
control panel active at any given time, there must be some way to determine which thread is 
associated with which panel. It is dialogmap that provides this linkage. 

The ThrdCtrlPanel Constructor

The ThrdCtrlPanel constructor is shown here. The constructor is passed the instance handle of 
the application and the handle of the thread being controlled. The instance handle is needed to 
create the control panel dialog box. 

// Create a thread control panel.
ThrdCtrlPanel::ThrdCtrlPanel(HINSTANCE hInst,
                             HANDLE hThrd)
{
  ThreadInfo ti(hThrd,
                GetThreadPriority(hThrd)+OFFSET,
                false);

  // Owner window is desktop.
  HWND hDialog = CreateDialog(hInst, "ThreadPanelDB",
                              NULL,
                              (DLGPROC) ThreadPanel);

  // Put info about this dialog box in the map.
  dialogmap.insert(pair<HWND, ThreadInfo>(hDialog, ti));

  // Set the control panel's title.
  char str[80] = "Control Panel for Thread #";
  char str2[4];
  _itoa(dialogmap.size(), str2, 10);
  strcat(str, str2);
  SetWindowText(hDialog, str);

  // Offset each dialog box instance.
  MoveWindow(hDialog, 30*dialogmap.size(),
             30*dialogmap.size(),
             300, 250, 1);

  // Update priority setting in the list box.
  SendDlgItemMessage(hDialog, IDD_LB, LB_SETCURSEL,
                     (WPARAM) ti.priority, 0);

  // Increase priority to ensure control. You can
  // change or remove this statement based on your
  // execution environment.
  SetThreadPriority(GetCurrentThread(),
                    THREAD_PRIORITY_ABOVE_NORMAL);
}


The constructor begins by creating a ThreadInfo instance called ti that contains the initial
settings for the thread. Notice that the priority is obtained by calling GetThreadPriority( ) for 
the thread being controlled. Next, the control panel dialog box is created by calling 
CreateDialog( ). CreateDialog( ) is a Windows API function that creates a modeless dialog box, 
which makes it independent of the application that creates it. The handle of this dialog box is 
returned and stored in hDialog. Next, hDialog and the thread information contained in ti are 
stored in dialogmap. Thus, the thread is linked with the dialog box that controls it. 

Next, the title of the dialog box is set to reflect the number of the thread. The number of the 
thread is obtained based on the number of entries in dialogmap. An alternative that you might 
want to try implementing is to explicitly pass a name for each thread to the ThrdCtrlPanel 
constructor. For the purposes of this chapter, simply numbering each thread is sufficient. 

Next, the control panel's position on the screen is offset a bit by calling MoveWindow( ), 
another Windows API function. This enables multiple panels to be displayed without each one 
fully covering the one before it. The thread's priority setting is then displayed in the priority 
list box by calling the Windows API function SendDlgItemMessage( ). 

Finally, the current thread has its priority increased to above normal. This ensures that the
application receives enough CPU time to be responsive to user input regardless of the
priority level of the thread under control. This step may not be needed in all cases. You can
experiment to find out.

The ThreadPanel( ) Function 

ThreadPanel( ) is the Windows callback function that responds to user interaction with the 
thread control panel. Like all dialog box callback functions, it receives a message each time 
the user changes the state of a control. It is passed the handle of the dialog box in which the 
action occurred, the message, and any additional information required by the message. Its 
general mode of operation is the same as that for any other callback function used by a dialog 
box. The following discussion describes what happens for each message. 

When the thread control panel dialog box is first created, it receives a WM_INITDIALOG 
message, which is handled by this case sequence: 


case WM_INITDIALOG:
  // Initialize priority list box.
  for(i=0; i<NUMPRIORITIES; i++) {
    SendDlgItemMessage(hwnd, IDD_LB,
        LB_ADDSTRING, 0, (LPARAM) priorities[i]);
  }

  // Set Suspend and Resume buttons for thread.
  hpbSus = GetDlgItem(hwnd, IDD_SUSPEND);
  hpbRes = GetDlgItem(hwnd, IDD_RESUME);
  EnableWindow(hpbSus, true);  // enable Suspend
  EnableWindow(hpbRes, false); // disable Resume
  return 1;


This initializes the priority list box and sets the Suspend and Resume buttons to their initial 
states, which are Suspend enabled and Resume disabled. 

Each user interaction generates a WM_COMMAND message. Each time this message is 
received, an iterator to this dialog box's entry in dialogmap is retrieved, as shown here: 

case WM_COMMAND:
  map<HWND, ThreadInfo>::iterator p = dialogmap.find(hwnd);

The information pointed to by p will be used to properly process each action. Because p is an 
iterator for a map, it points to an object of type pair, which is a structure defined by the STL. 
This structure contains two fields: first and second. These fields correspond to the information 
that comprises the key and the value, respectively. In this case, the handle is the key and the 
thread information is the value. 
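The relationship between a map iterator and its first and second fields can be tried without any Windows code. The following sketch makes some assumptions for the sake of illustration: an int stands in for the HWND key, the PanelInfo structure stands in for ThreadInfo, and the function markSuspended( ) is hypothetical. Unlike the control panel code, it also checks for a failed lookup.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for ThreadInfo.
struct PanelInfo {
  std::string threadName;
  int priority;
  bool suspended;
};

// An int plays the role of the dialog box handle (the key).
typedef std::map<int, PanelInfo> PanelMap;

// Look up the entry for a dialog handle and mark its thread suspended.
// Returns false if the handle is not in the map.
bool markSuspended(PanelMap &panels, int dialogHandle) {
  PanelMap::iterator p = panels.find(dialogHandle);
  if (p == panels.end()) return false; // no such panel
  assert(p->first == dialogHandle);    // first holds the key
  p->second.suspended = true;          // second holds the value
  return true;
}
```

Here, p->first is the key that was searched for and p->second is the PanelInfo stored with it, which is why the control panel reaches its thread handle through p->second.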

A code indicating precisely what action has occurred is contained in the low-order word of 
wParam, which is used to control a switch statement that handles the remaining messages. 
Each is described next. 

When the user presses the Terminate button, the thread under control is stopped. This is 
handled by this case sequence: 


case IDD_TERMINATE:
  TerminateThread(p->second.hThread, 0);

  // Disable Terminate button.
  hpbTerm = GetDlgItem(hwnd, IDD_TERMINATE);
  EnableWindow(hpbTerm, false); // disable

  // Disable Suspend and Resume buttons.
  hpbSus = GetDlgItem(hwnd, IDD_SUSPEND);
  hpbRes = GetDlgItem(hwnd, IDD_RESUME);
  EnableWindow(hpbSus, false); // disable Suspend
  EnableWindow(hpbRes, false); // disable Resume
  return 1;


The thread is stopped with a call to TerminateThread( ). Notice how the handle for the thread 
is obtained. As explained, because p is an iterator for a map, it points to an object of type pair 
that contains the key in its first field and the value in its second field. This is why the thread 
handle is obtained by the expression p->second.hThread. After the thread is stopped, the
Terminate button is disabled. 

Once a thread has been terminated, it cannot be resumed. Notice that the control panel uses 
TerminateThread( ) to halt execution of a thread. As mentioned earlier, this function must be 
used with care. If you use the control panel to experiment with threads of your own, you will 
want to make sure that no harmful side effects are possible. 


When the user presses the Suspend button, the thread is suspended. This is accomplished by 
the following sequence: 


case IDD_SUSPEND:
  SuspendThread(p->second.hThread);

  // Set state of the Suspend and Resume buttons.
  hpbSus = GetDlgItem(hwnd, IDD_SUSPEND);
  hpbRes = GetDlgItem(hwnd, IDD_RESUME);
  EnableWindow(hpbSus, false); // disable Suspend
  EnableWindow(hpbRes, true);  // enable Resume
  p->second.suspended = true;
  return 1;


The thread is suspended by a call to SuspendThread( ). Next, the states of the Suspend and
Resume buttons are updated such that Resume is enabled and Suspend is disabled. This
prevents the user from attempting to suspend a thread twice.

A suspended thread is resumed when the Resume button is pressed. It is handled by this 
code: 


case IDD_RESUME:
  ResumeThread(p->second.hThread);

  // Set state of the Suspend and Resume buttons.
  hpbSus = GetDlgItem(hwnd, IDD_SUSPEND);
  hpbRes = GetDlgItem(hwnd, IDD_RESUME);
  EnableWindow(hpbSus, true);  // enable Suspend
  EnableWindow(hpbRes, false); // disable Resume
  p->second.suspended = false;
  return 1;


The thread is resumed by a call to ResumeThread( ), and the Suspend and Resume buttons 
are set appropriately. 

To change a thread's priority, the user double-clicks an entry in the Priority list box. This event 
is handled as shown next: 


case IDD_LB:
  // If a list box entry was double-clicked,
  // then change the priority.
  if(HIWORD(wParam)==LBN_DBLCLK) {
    p->second.priority = SendDlgItemMessage(hwnd,
                             IDD_LB, LB_GETCURSEL,
                             0, 0);
    SetThreadPriority(p->second.hThread,
                      p->second.priority-OFFSET);
  }
  return 1;


List boxes generate various types of notification messages that describe the precise type of 
event that occurred. Notification messages are contained in the high-order word of wParam. 
One of these messages is LBN_DBLCLK, which means that the user double-clicked an entry in 
the box. When this notification is received, the index of the entry is retrieved by calling the 
Windows API function SendDlgItemMessage( ), requesting the current selection. This value is 
then used to set the thread's priority. Notice that OFFSET is subtracted to normalize the value 
of the index. 
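To make the packing of wParam concrete, here is a small, portable sketch. It does not use the Windows headers. The functions lowWord( ), highWord( ), and packCommand( ) are hypothetical stand-ins that mimic what LOWORD and HIWORD extract from a 32-bit value, and the notification code used with them is only a placeholder for the constant that the Windows headers supply.

```cpp
#include <cassert>
#include <cstdint>

// Portable stand-ins for the Windows LOWORD and HIWORD macros,
// applied to a 32-bit value.
std::uint16_t lowWord(std::uint32_t v) {
  return static_cast<std::uint16_t>(v & 0xFFFF);
}

std::uint16_t highWord(std::uint32_t v) {
  return static_cast<std::uint16_t>((v >> 16) & 0xFFFF);
}

// Pack a control ID and a notification code the way a WM_COMMAND
// wParam carries them: ID in the low word, code in the high word.
std::uint32_t packCommand(std::uint16_t controlId,
                          std::uint16_t notifyCode) {
  std::uint32_t w =
      (static_cast<std::uint32_t>(notifyCode) << 16) | controlId;
  assert(lowWord(w) == controlId && highWord(w) == notifyCode);
  return w;
}
```

For example, packing the control ID 200 (the value of IDD_LB in panel.h) with a placeholder notification code of 2 gives a value whose low word is 200 and whose high word is 2. This mirrors how ThreadPanel( ) first switches on the low-order word to find the control and then tests the high-order word for the double-click notification.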


Finally, when the user closes the thread control panel dialog box, the IDCANCEL message is 
sent. It is handled by the following sequence: 


case IDCANCEL:
  // If thread is suspended when panel is closed,
  // then resume thread to prevent deadlock.
  if(p->second.suspended) {
    ResumeThread(p->second.hThread);
    p->second.suspended = false;
  }

  // Remove this thread from the list.
  dialogmap.erase(hwnd);

  // Close the panel.
  DestroyWindow(hwnd);
  return 1;


If the thread was suspended, it is restarted. This is necessary to avoid accidentally 
deadlocking the thread. Next, this dialog box's entry in dialogmap is removed. Finally, the 
dialog box is removed by calling the Windows API function DestroyWindow( ). 

Here is a program that includes the thread control panel and demonstrates its use. Sample 
output is shown in Figure 3-2. The program creates a main window and defines two child 
threads. When started, these threads simply count from 0 to 50,000, displaying the count in 
the main window. These threads can be controlled by activating a thread control panel. 

To use the program, first begin execution of the threads by selecting Start Threads from the 
Threads menu (or by pressing F2) and then activate the thread control panels by selecting 
Control Panels from the Threads menu (or by pressing F3). Once the control panels are active, 
you can experiment with different priority settings and so on. 


// Demonstrate the thread control panel.
#include <windows.h>
#include <process.h>
#include "thrdapp.h"
#include "tcp.cpp"

const int MAX = 500000;

LRESULT CALLBACK WindowFunc(HWND, UINT, WPARAM, LPARAM);
unsigned __stdcall MyThread1(void * param);
unsigned __stdcall MyThread2(void * param);

char str[255]; // holds output strings

unsigned tid1, tid2; // thread IDs
HANDLE hThread1, hThread2; // thread handles

HINSTANCE hInst; // instance handle

int WINAPI WinMain(HINSTANCE hThisInst, HINSTANCE hPrevInst,
                   LPSTR args, int winMode)
{
  HWND hwnd;
  MSG msg;
  WNDCLASSEX wcl;
  HACCEL hAccel;

  // Define a window class.
  wcl.cbSize = sizeof(WNDCLASSEX);
  wcl.hInstance = hThisInst;     // handle to this instance
  wcl.lpszClassName = "MyWin";   // window class name
  wcl.lpfnWndProc = WindowFunc;  // window function
  wcl.style = 0;                 // default style
  wcl.hIcon = LoadIcon(NULL, IDI_APPLICATION); // large icon
  wcl.hIconSm = NULL; // use small version of large icon
  wcl.hCursor = LoadCursor(NULL, IDC_ARROW);   // cursor style
  wcl.lpszMenuName = "ThreadAppMenu"; // main menu
  wcl.cbClsExtra = 0; // no extra memory needed
  wcl.cbWndExtra = 0;

  // Make the window background white.
  wcl.hbrBackground = (HBRUSH) GetStockObject(WHITE_BRUSH);

  // Register the window class.
  if(!RegisterClassEx(&wcl)) return 0;

  /* Now that a window class has been registered, a window
     can be created. */
  hwnd = CreateWindow(
    wcl.lpszClassName, // name of window class
    "Using a Thread Control Panel", // title
    WS_OVERLAPPEDWINDOW, // window style - normal
    CW_USEDEFAULT, // X coordinate - let Windows decide
    CW_USEDEFAULT, // Y coordinate - let Windows decide
    260, // width
    200, // height
    NULL, // no parent window
    NULL, // no override of class menu
    hThisInst, // instance handle
    NULL // no additional arguments
  );

  hInst = hThisInst; // save instance handle

  // Load the keyboard accelerators.
  hAccel = LoadAccelerators(hThisInst, "ThreadAppMenu");

  // Display the window.
  ShowWindow(hwnd, winMode);
  UpdateWindow(hwnd);

  // Create the message loop.
  while(GetMessage(&msg, NULL, 0, 0))
  {
    if(!TranslateAccelerator(hwnd, hAccel, &msg)) {
      TranslateMessage(&msg); // translate keyboard messages
      DispatchMessage(&msg);  // return control to Windows
    }
  }

  return msg.wParam;
}

/* This function is called by Windows and is passed
   messages from the message queue.
*/
LRESULT CALLBACK WindowFunc(HWND hwnd, UINT message,
                            WPARAM wParam, LPARAM lParam)
{
  int response;

  switch(message) {
    case WM_COMMAND:
      switch(LOWORD(wParam)) {
        case IDM_THREAD: // create the threads
          hThread1 = (HANDLE) _beginthreadex(NULL, 0,
                                 MyThread1, (void *) hwnd,
                                 0, &tid1);
          hThread2 = (HANDLE) _beginthreadex(NULL, 0,
                                 MyThread2, (void *) hwnd,
                                 0, &tid2);
          break;
        case IDM_PANEL: // activate control panel
          ThrdCtrlPanel(hInst, hThread1);
          ThrdCtrlPanel(hInst, hThread2);
          break;
        case IDM_EXIT:
          response = MessageBox(hwnd, "Quit the Program?",
                                "Exit", MB_YESNO);
          if(response == IDYES) PostQuitMessage(0);
          break;
        case IDM_HELP:
          MessageBox(hwnd,
                     "F1: Help\nF2: Start Threads\nF3: Panel",
                     "Help", MB_OK);
          break;
      }
      break;
    case WM_DESTROY: // terminate the program
      PostQuitMessage(0);
      break;
    default:
      return DefWindowProc(hwnd, message, wParam, lParam);
  }

  return 0;
}

// First thread.
unsigned __stdcall MyThread1(void * param)
{
  int i;
  HDC hdc;

  for(i=0; i<MAX; i++) {
    wsprintf(str, "Thread 1: loop # %5d ", i);
    hdc = GetDC((HWND) param);
    TextOut(hdc, 1, 1, str, lstrlen(str));
    ReleaseDC((HWND) param, hdc);
  }

  return 0;
}

// Second thread.
unsigned __stdcall MyThread2(void * param)
{
  int i;
  HDC hdc;

  for(i=0; i<MAX; i++) {
    wsprintf(str, "Thread 2: loop # %5d ", i);
    hdc = GetDC((HWND) param);
    TextOut(hdc, 1, 20, str, lstrlen(str));
    ReleaseDC((HWND) param, hdc);
  }

  return 0;
}


This program requires the header file thrdapp.h, shown here: 

#define IDM_THREAD 100 
#define IDM_HELP 101 
#define IDM_PANEL 102 
#define IDM_EXIT 103 


The resource file required by the program is shown here: 

#include <windows.h> 

#include "thrdapp.h" 

#include "tcp.rc" 

ThreadAppMenu MENU 

{ 

POPUP "&Threads" { 


MENUITEM "&Start Threads\tF2", IDM_THREAD 
MENUITEM "&Control Panels\tF3", IDM_PANEL 
MENUITEM "E&xit\tCtrl+X", IDM_EXIT 
} 

MENUITEM "&Help", IDM_HELP 

} 

ThreadAppMenu ACCELERATORS
{
  VK_F1, IDM_HELP, VIRTKEY
  VK_F2, IDM_THREAD, VIRTKEY
  VK_F3, IDM_PANEL, VIRTKEY
  "^X", IDM_EXIT
}


Although controlling threads using the thread control panel is useful when developing 
multithreaded programs, ultimately it is using threads that makes them important. Toward 
this end, this chapter shows a multithreaded version of the GCPtr garbage collector class 
originally developed in Chapter 2. Recall that the version of GCPtr shown in Chapter 2 
collected unused memory each time a GCPtr object went out of scope. Although this approach 
is fine for some applications, often a better alternative is to have the garbage collector run as a
background task, recycling memory whenever free CPU cycles are available. The 
implementation developed here is designed for Windows, but the same basic techniques apply 
to other multithreaded environments. 

To convert GCPtr into a background task is actually fairly easy, but it does involve a number of 
changes. Here are the main ones: 

Member variables that support the thread must be added to GCPtr. These variables include the 
thread handle, the mutex handle, and an instance counter that keeps track of the number of 
GCPtr objects in existence. 

The constructor for GCPtr must begin the garbage collection thread. The constructor must also 
create the mutex that controls synchronization. This must happen only once, when the first 
GCPtr object is created. 

Another exception must be defined that will be used to indicate a time-out condition. 

The GCPtr destructor must no longer call collect( ). Garbage collection is handled by the 
garbage collection thread. 

A function called gc( ) that serves as the thread entry point for the garbage collector must be 
defined. 

A function called isRunning( ) must be defined. It returns true if the garbage collector is in
use.

The member functions of GCPtr that access the garbage collection list contained in gclist must 
be synchronized so that only one thread at a time can access the list. 

The following sections show the changes. 


The Additional Member Variables 

The multithreaded version of GCPtr requires that the following member variables be added: 



// These support multithreading, 
unsigned tid; // thread id 
static HANDLE hThrd; // thread handle 
static HANDLE hMutex; // handle of mutex 
static int instCount; // counter of GCPtr objects 

The ID of the thread used by the garbage collector is stored in tid. This member is unused 
except in the call to _beginthreadex( ). The handle to the thread is stored in hThrd. The 
handle of the mutex used to synchronize access to GCPtr is stored in hMutex. A count of GCPtr 
objects in existence is maintained in instCount. The last three are static because they are 
shared by all instances of GCPtr. They are defined like this, outside of GCPtr: 

template <class T, int size>
  int GCPtr<T, size>::instCount = 0;

template <class T, int size>
  HANDLE GCPtr<T, size>::hMutex = 0;

template <class T, int size>
  HANDLE GCPtr<T, size>::hThrd = 0;
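The out-of-class definition of a static member of a class template can be tried on its own. The following is a minimal sketch that uses a hypothetical class called Counted, which is not part of GCPtr. It keeps only an instance counter and follows the same declaration and definition pattern as instCount.

```cpp
#include <cassert>

// Hypothetical class template with a static instance counter,
// mirroring the way GCPtr declares instCount.
template <class T, int size = 0>
class Counted {
public:
  static int instCount; // shared by all objects of one instantiation

  Counted() { instCount++; }
  Counted(const Counted &) { instCount++; }
  ~Counted() {
    instCount--;
    assert(instCount >= 0); // never more destructions than constructions
  }
};

// Definition of the static member, outside the class.
template <class T, int size>
  int Counted<T, size>::instCount = 0;
```

One point worth noticing is that each instantiation of a class template receives its own copy of a static member. In this sketch, Counted<int> and Counted<double> count their objects separately. By the same language rule, each distinct GCPtr<T, size> type would have its own instCount, hMutex, and hThrd.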

The Multithreaded GCPtr Constructor 

In addition to its original duties, the multithreaded GCPtr( ) must create the mutex, start the 
garbage collector thread, and update the instance counter. Here is the updated version: 

// Construct both initialized and uninitialized objects.
GCPtr(T *t=NULL) {

  // When first object is created, create the mutex
  // and register shutdown().
  if(hMutex==0) {
    hMutex = CreateMutex(NULL, 0, NULL);
    atexit(shutdown);
  }

  if(WaitForSingleObject(hMutex, 10000)==WAIT_TIMEOUT)
    throw TimeOutExc();

  typename list<GCInfo<T> >::iterator p;
  p = findPtrInfo(t);

  // If t is already in gclist, then
  // increment its reference count.
  // Otherwise, add it to the list.
  if(p != gclist.end())
    p->refcount++; // increment ref count
  else {
    // Create and store this entry.
    GCInfo<T> gcObj(t, size);
    gclist.push_front(gcObj);
  }

  addr = t;
  arraySize = size;
  if(size > 0) isArray = true;
  else isArray = false;

  // Increment instance counter for each new object.
  instCount++;

  // If the garbage collection thread is not
  // currently running, start it running.
  if(hThrd==0) {
    hThrd = (HANDLE) _beginthreadex(NULL, 0, gc,
              (void *) 0, 0, (unsigned *) &tid);

    // For some applications, it will be better
    // to lower the priority of the garbage collector
    // as shown here:
    //
    // SetThreadPriority(hThrd,
    //                   THREAD_PRIORITY_BELOW_NORMAL);
  }

  ReleaseMutex(hMutex);
}

Let's examine this code closely. First, if hMutex is zero, it means that this is the first GCPtr 
object to be created and no mutex has yet been created for the garbage collector. If this is the 
case, the mutex is created and its handle is assigned to hMutex. At the same time, the 
function shutdown( ) is registered as a termination function by calling atexit( ). 

It is important to note that in the multithreaded garbage collector, shutdown( ) serves two 
purposes. First, as in the original version of GCPtr, shutdown( ) frees any unused memory that 
has not been released because of a circular reference. Second, when a program using the 
multithreaded garbage collector ends, it stops the garbage collection thread. This means that 
there might still be dynamically allocated objects that haven't been freed. This is important 
because these objects might have destructors that need to be called. Because shutdown( ) 
releases all remaining objects, it also releases these objects. 

Next, the mutex is acquired by calling WaitForSingleObject( ). This is necessary to prevent two 
threads from accessing gclist at the same time. Once the mutex has been acquired, a search 
of gclist is made, looking for any preexisting entry that matches the address in t. If one is 
found, its reference count is incremented. If no preexisting entry matches t, a new GCInfo
object is created that contains this address, and this object is added to gclist. Then, addr, 
arraySize, and isArray are set. These actions are the same as in the original version of GCPtr. 

Next, instCount is incremented. Recall that instCount is initialized to zero. Incrementing it 
each time an object is created keeps track of how many GCPtr objects are in existence. As 
long as this count is above zero, the garbage collector will continue to execute. 

Next, if hThrd is zero (as it is initially), then no thread has yet been created for the garbage 
collector. In this case, _beginthreadex( ) is called to begin the thread. A handle to the thread 
is then assigned to hThrd. The thread entry function is called gc( ), and it is examined shortly. 

Finally, the mutex is released and the constructor returns. It is important to point out that 
each call to WaitForSingleObject( ) must be balanced by a call to ReleaseMutex( ), as shown in 
the GCPtr constructor. Failure to release the mutex will cause deadlock. 
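One way to guarantee that every acquisition is balanced by a release is to let an object's destructor perform the release. The following sketch shows the idea with the standard C++ std::mutex and std::lock_guard, which are a substitute for, not a description of, the Windows calls used by GCPtr. The names listMutex, sharedList, addValue( ), and mutexIsFree( ) are hypothetical.

```cpp
#include <cassert>
#include <list>
#include <mutex>
#include <stdexcept>

std::mutex listMutex;      // stands in for hMutex
std::list<int> sharedList; // stands in for gclist

// Add a value to the shared list. The lock_guard releases the mutex
// when it goes out of scope, even if an exception is thrown.
void addValue(int v) {
  std::lock_guard<std::mutex> guard(listMutex);
  if (v < 0) throw std::invalid_argument("negative value");
  sharedList.push_back(v);
  assert(!sharedList.empty());
}

// Returns true if the mutex can be acquired right now,
// which means that no one was left holding it.
bool mutexIsFree() {
  if (!listMutex.try_lock()) return false;
  listMutex.unlock();
  return true;
}
```

After a normal call such as addValue(1), and also after a call such as addValue(-1) that exits by throwing, mutexIsFree( ) is expected to return true. With explicit acquire and release calls, the early exit is exactly the kind of path on which the release is easy to forget.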

The TimeOutExc Exception 

As you probably noticed in the code for GCPtr( ) described in the preceding section, if the 
mutex cannot be acquired after 10 seconds, then a TimeOutExc is thrown. Frankly, 10 seconds 
is a very long time, so a time-out shouldn't ever happen unless something disrupts the task 
scheduler of the operating system. However, in the event it does occur, your application code 
may want to catch this exception. The TimeOutExc class is shown here: 

// Exception thrown when a time-out occurs
// when waiting for access to hMutex.
//
class TimeOutExc {
  // Add functionality if needed by your application.
};


Notice that it contains no members. Its existence as a unique type is sufficient for the 
purposes of this chapter. Of course, you can add functionality if desired. 
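The following sketch shows how an empty exception class can still be caught by its type alone. TimeOutExc is declared as in the chapter, but the functions acquireOrThrow( ) and tryOperation( ) are hypothetical and simply simulate a time-out with a flag rather than waiting on a real mutex.

```cpp
#include <cassert>
#include <string>

// Empty exception type, as in the chapter.
class TimeOutExc {
  // Add functionality if needed by your application.
};

// Hypothetical operation that reports a time-out by throwing.
void acquireOrThrow(bool timedOut) {
  if (timedOut) throw TimeOutExc();
}

// Returns a short status string describing what happened.
std::string tryOperation(bool timedOut) {
  try {
    acquireOrThrow(timedOut);
    assert(!timedOut); // reached only when no time-out occurred
    return "ok";
  } catch (const TimeOutExc &) {
    return "timed out"; // the type alone identifies the error
  }
}
```

Calling tryOperation(false) yields "ok", and calling tryOperation(true) yields "timed out". Application code that creates GCPtr objects could wrap those statements in a similar try block.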



The Multithreaded GCPtr Destructor 


Unlike the single-threaded version of the GCPtr destructor, the multithreaded version of 
~GCPtr( ) does not call collect( ). Instead, it simply decrements the reference count of the 
memory pointed to by the GCPtr that is going out of scope. The actual collection of garbage (if 
any exists) is handled by the garbage collection thread. The destructor also decrements the 
instance counter, instCount. 

The multithreaded version of ~GCPtr( ) is shown here: 

// Destructor for GCPtr.
template <class T, int size>
GCPtr<T, size>::~GCPtr() {
  if(WaitForSingleObject(hMutex, 10000)==WAIT_TIMEOUT)
    throw TimeOutExc();

  typename list<GCInfo<T> >::iterator p;
  p = findPtrInfo(addr);

  if(p->refcount) p->refcount--; // decrement ref count

  // Decrement instance counter for each object
  // that is destroyed.
  instCount--;

  ReleaseMutex(hMutex);
}


The gc( ) Function 

The entry function for the garbage collector is called gc( ), and it is shown here: 

// Entry point for garbage collector thread.
template <class T, int size>
unsigned __stdcall GCPtr<T, size>::gc(void *param) {
  #ifdef DISPLAY
    cout << "Garbage collection started.\n";
  #endif

  while(isRunning()) {
    collect();
  }

  collect(); // collect garbage on way out

  // Release and reset the thread handle so
  // that the garbage collection thread can
  // be restarted if necessary.
  CloseHandle(hThrd);
  hThrd = 0;

  #ifdef DISPLAY
    cout << "Garbage collection terminated for "
         << typeid(T).name() << "\n";
  #endif

  return 0;
}


The gc( ) function is quite simple: it runs as long as the garbage collector is in use. The 
isRunning( ) function returns true if instCount is greater than zero (which means that the 
garbage collector is still needed) and false otherwise. Inside the loop, collect( ) is called 
continuously. This approach is suitable for demonstrating the multithreaded garbage collector, 
but it is probably too inefficient for real-world use. You might want to experiment with calling 
collect( ) less often, such as only when memory runs low. You could also experiment by calling 
the Windows API function Sleep( ) after each call to collect( ). Sleep( ) pauses the execution of 
the calling thread for a specified number of milliseconds. While sleeping, a thread does not 
consume CPU time. 
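The effect of pausing between collections can be sketched in portable C++. This is only an illustration under stated assumptions: std::this_thread::sleep_for stands in for the Windows Sleep( ) function, collect( ) is a placeholder that merely counts its calls, and runCollector is a hypothetical helper used to drive the loop.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

std::atomic<int> instCount(0);    // number of live smart pointers (assumed)
std::atomic<int> collectCalls(0); // counts passes, for illustration only

// Hypothetical placeholder for GCPtr<T>::collect().
void collect() { ++collectCalls; }

bool isRunning() { return instCount > 0; }

// Collector loop that pauses between passes instead of spinning.
void gcLoop(std::chrono::milliseconds pause) {
  while(isRunning()) {
    collect();
    std::this_thread::sleep_for(pause); // portable analog of Sleep(ms)
  }
  collect(); // final pass on the way out
}

// Runs the loop on a thread for roughly runFor, then returns the pass count.
int runCollector(std::chrono::milliseconds pause,
                 std::chrono::milliseconds runFor) {
  collectCalls = 0;
  instCount = 1;                  // pretend one GCPtr exists
  std::thread t(gcLoop, pause);
  std::this_thread::sleep_for(runFor);
  instCount = 0;                  // pretend the last GCPtr was destroyed
  t.join();
  return collectCalls;
}
```

With a pause in place, the number of passes is bounded by the elapsed time divided by the pause length rather than by how fast the loop can spin. The trade-off is that garbage may linger for up to one pause interval.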



When isRunning( ) returns false, the loop ends, causing gc( ) to eventually end, which stops 
the garbage collection thread. Because of the multithreading, it is possible that there will still 
be an entry on gclist that has not yet been freed even though isRunning( ) returns false. To 
handle this case, a final call to collect( ) is made before gc( ) ends. 

Finally, the thread handle is released via a call to the Windows API function CloseHandle( ), 
and its value is set to zero. Setting hThrd to zero enables the GCPtr constructor to restart the 
thread if later in the program new GCPtr objects are created. 

The isRunning( ) Function 

The isRunning( ) function is shown here: 

// Returns true if the collector is still in use. 
static bool isRunning() { return instCount > 0; }

It simply compares instCount to zero. As long as instCount is greater than 0, at least one 
GCPtr pointer is still in existence and the garbage collector is still needed. 

Many of the functions in GCPtr access gclist, which holds the garbage collection list. Access to 
gclist must be synchronized to prevent two or more threads from attempting to use it at the 
same time. The reason for this is easy to understand. If access were not synchronized, then, 
for example, one thread might be obtaining an iterator to the end of the list at the same time 
that another thread is adding or deleting an element from the list. In this case, the iterator 
would be invalid. To prevent such problems, each sequence of code that accesses gclist must 
be guarded by a mutex. The copy constructor for GCPtr shown here is one example: 

// Copy constructor.
GCPtr(const GCPtr &ob) {
  if(WaitForSingleObject(hMutex, 10000)==WAIT_TIMEOUT)
    throw TimeOutExc();

  typename list<GCInfo<T> >::iterator p;
  p = findPtrInfo(ob.addr);
  p->refcount++; // increment ref count

  addr = ob.addr;
  arraySize = ob.arraySize;
  if(arraySize > 0) isArray = true;
  else isArray = false;

  instCount++; // increase instance count for copy

  ReleaseMutex(hMutex);
}

Notice that the first thing that the copy constructor does is acquire the mutex. Once acquired, 
it creates a copy of the object and adjusts the reference count for the memory being pointed 
to. On its way out, the copy constructor releases the mutex. This same basic method is 
applied to all functions that access gclist. 
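The same guarding pattern can be sketched with the standard library. This is not the chapter's Windows code: sharedList stands in for gclist, std::mutex stands in for hMutex, and the function names are hypothetical.

```cpp
#include <cstddef>
#include <list>
#include <mutex>
#include <thread>

std::list<int> sharedList;  // stands in for gclist
std::mutex listMutex;       // stands in for hMutex

// The lock is held around each push, so the list is never
// modified by two threads at the same time.
void addEntries(int first, int count) {
  for(int i = 0; i < count; i++) {
    std::lock_guard<std::mutex> lock(listMutex);
    sharedList.push_front(first + i);
  }
}

// Two threads add entries concurrently; returns the final list size.
std::size_t fillFromTwoThreads(int perThread) {
  sharedList.clear();
  std::thread a(addEntries, 0, perThread);
  std::thread b(addEntries, perThread, perThread);
  a.join();
  b.join();
  std::lock_guard<std::mutex> lock(listMutex);
  return sharedList.size();
}
```

With the lock in place, every insertion is accounted for. Without it, the two threads could corrupt the list's internal links, and the result would be undefined.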

Two Other Changes 

There are two other changes that you must make to the original version of the garbage 
collector. First, recall that the original version of GCPtr defined a static variable called first that 
indicated when the first GCPtr was created. This variable is no longer needed because hMutex 
now performs this function. Thus, remove first from GCPtr. Because it is a static variable, you 
will also need to remove its definition outside of GCPtr. 

Second, in the original, single-threaded version of the garbage collector, if you defined the
DISPLAY macro, you could watch the garbage collector in action. Most of that code has been
removed in the multithreaded version because multithreading causes the output to be
scrambled and unintelligible in most cases. For the multithreaded version, defining DISPLAY
simply lets you know when the garbage collector has started and when it has stopped.

The entire multithreaded version of the garbage collector is shown here. Call this file gcthrd.h. 


// A garbage collector that runs as a background task.
#include <iostream>
#include <list>
#include <typeinfo>
#include <cstdlib>
#include <windows.h>
#include <process.h>

using namespace std;

// To watch the action of the garbage collector, define DISPLAY.
// #define DISPLAY

// Exception thrown when an attempt is made to
// use an Iter that exceeds the range of the
// underlying object.
//
class OutOfRangeExc {
  // Add functionality if needed by your application.
};

// Exception thrown when a time-out occurs
// when waiting for access to hMutex.
//
class TimeOutExc {
  // Add functionality if needed by your application.
};

// An iterator-like class for cycling through arrays
// that are pointed to by GCPtrs. Iter pointers
// ** do not ** participate in or affect garbage
// collection. Thus, an Iter pointing to
// some object does not prevent that object
// from being recycled.
//
template <class T> class Iter {
  T *ptr;   // current pointer value
  T *end;   // points to element one past end
  T *begin; // points to start of allocated array
  unsigned length; // length of sequence
public:

  Iter() {
    ptr = end = begin = NULL;
    length = 0;
  }

  Iter(T *p, T *first, T *last) {
    ptr = p;
    end = last;
    begin = first;
    length = last - first;
  }

  // Return length of sequence to which this
  // Iter points.
  unsigned size() { return length; }

  // Return value pointed to by ptr.
  // Do not allow out-of-bounds access.
  T &operator*() {
    if( (ptr >= end) || (ptr < begin) )
      throw OutOfRangeExc();
    return *ptr;
  }

  // Return address contained in ptr.
  // Do not allow out-of-bounds access.
  T *operator->() {
    if( (ptr >= end) || (ptr < begin) )
      throw OutOfRangeExc();
    return ptr;
  }

  // Prefix ++.
  Iter operator++() {
    ptr++;
    return *this;
  }

  // Prefix --.
  Iter operator--() {
    ptr--;
    return *this;
  }

  // Postfix ++.
  Iter operator++(int notused) {
    T *tmp = ptr;
    ptr++;
    return Iter<T>(tmp, begin, end);
  }

  // Postfix --.
  Iter operator--(int notused) {
    T *tmp = ptr;
    ptr--;
    return Iter<T>(tmp, begin, end);
  }

  // Return a reference to the object at the
  // specified index. Do not allow out-of-bounds
  // access.
  T &operator[](int i) {
    if( (i < 0) || (i >= (end-begin)) )
      throw OutOfRangeExc();
    return ptr[i];
  }

  // Define the relational operators.
  bool operator==(Iter op2) {
    return ptr == op2.ptr;
  }

  bool operator!=(Iter op2) {
    return ptr != op2.ptr;
  }

  bool operator<(Iter op2) {
    return ptr < op2.ptr;
  }

  bool operator<=(Iter op2) {
    return ptr <= op2.ptr;
  }

  bool operator>(Iter op2) {
    return ptr > op2.ptr;
  }

  bool operator>=(Iter op2) {
    return ptr >= op2.ptr;
  }

  // Subtract an integer from an Iter.
  Iter operator-(int n) {
    ptr -= n;
    return *this;
  }

  // Add an integer to an Iter.
  Iter operator+(int n) {
    ptr += n;
    return *this;
  }

  // Return number of elements between two Iters.
  int operator-(Iter<T> &itr2) {
    return ptr - itr2.ptr;
  }
};

// This class defines an element that is stored
// in the garbage collection information list.
//
template <class T> class GCInfo {
public:
  unsigned refcount; // current reference count

  T *memPtr; // pointer to allocated memory

  /* isArray is true if memPtr points
     to an allocated array. It is false
     otherwise. */
  bool isArray; // true if pointing to array

  /* If memPtr is pointing to an allocated
     array, then arraySize contains its size */
  unsigned arraySize; // size of array

  // Here, mPtr points to the allocated memory.
  // If this is an array, then size specifies
  // the size of the array.
  GCInfo(T *mPtr, unsigned size=0) {
    refcount = 1;
    memPtr = mPtr;
    if(size != 0)
      isArray = true;
    else
      isArray = false;

    arraySize = size;
  }
};

// Overloading operator== allows GCInfos to be compared.
// This is needed by the STL list class.
template <class T> bool operator==(const GCInfo<T> &ob1,
                                   const GCInfo<T> &ob2) {
  return (ob1.memPtr == ob2.memPtr);
}

// GCPtr implements a pointer type that uses
// garbage collection to release unused memory.
// A GCPtr must only be used to point to memory
// that was dynamically allocated using new.
// When used to refer to an allocated array,
// specify the array size.
//
template <class T, int size=0> class GCPtr {

  // gclist maintains the garbage collection list.
  static list<GCInfo<T> > gclist;

  // addr points to the allocated memory to which
  // this GCPtr pointer currently points.
  T *addr;

  /* isArray is true if this GCPtr points
     to an allocated array. It is false
     otherwise. */
  bool isArray; // true if pointing to array

  // If this GCPtr is pointing to an allocated
  // array, then arraySize contains its size.
  unsigned arraySize; // size of the array

  // These support multithreading.
  unsigned tid; // thread id
  static HANDLE hThrd; // thread handle
  static HANDLE hMutex; // handle of mutex

  static int instCount; // counter of GCPtr objects

  // Return an iterator to pointer info in gclist.
  typename list<GCInfo<T> >::iterator findPtrInfo(T *ptr);

public:

  // Define an iterator type for GCPtr<T>.
  typedef Iter<T> GCiterator;

  // Construct both initialized and uninitialized objects.
  GCPtr(T *t=NULL) {

    // When first object is created, create the mutex
    // and register shutdown().
    if(hMutex==0) {
      hMutex = CreateMutex(NULL, 0, NULL);
      atexit(shutdown);
    }

    if(WaitForSingleObject(hMutex, 10000)==WAIT_TIMEOUT)
      throw TimeOutExc();

    typename list<GCInfo<T> >::iterator p;
    p = findPtrInfo(t);

    // If t is already in gclist, then
    // increment its reference count.
    // Otherwise, add it to the list.
    if(p != gclist.end())
      p->refcount++; // increment ref count
    else {
      // Create and store this entry.
      GCInfo<T> gcObj(t, size);
      gclist.push_front(gcObj);
    }

    addr = t;
    arraySize = size;
    if(size > 0) isArray = true;
    else isArray = false;

    // Increment instance counter for each new object.
    instCount++;

    // If the garbage collection thread is not
    // currently running, start it running.
    if(hThrd==0) {
      hThrd = (HANDLE) _beginthreadex(NULL, 0, gc,
              (void *) 0, 0, (unsigned *) &tid);

      // For some applications, it will be better
      // to lower the priority of the garbage collector
      // as shown here:
      //
      // SetThreadPriority(hThrd,
      //                   THREAD_PRIORITY_BELOW_NORMAL);
    }

    ReleaseMutex(hMutex);
  }

  // Copy constructor.
  GCPtr(const GCPtr &ob) {
    if(WaitForSingleObject(hMutex, 10000)==WAIT_TIMEOUT)
      throw TimeOutExc();

    typename list<GCInfo<T> >::iterator p;
    p = findPtrInfo(ob.addr);
    p->refcount++; // increment ref count

    addr = ob.addr;
    arraySize = ob.arraySize;
    if(arraySize > 0) isArray = true;
    else isArray = false;

    instCount++; // increase instance count for copy

    ReleaseMutex(hMutex);
  }

  // Destructor for GCPtr.
  ~GCPtr();

  // Collect garbage. Returns true if at least
  // one object was freed.
  static bool collect();

  // Overload assignment of pointer to GCPtr.
  T *operator=(T *t);

  // Overload assignment of GCPtr to GCPtr.
  GCPtr &operator=(GCPtr &rv);

  // Return a reference to the object pointed
  // to by this GCPtr.
  T &operator*() {
    return *addr;
  }

  // Return the address being pointed to.
  T *operator->() { return addr; }

  // Return a reference to the object at the
  // index specified by i.
  T &operator[](int i) {
    return addr[i];
  }

  // Conversion function to T *.
  operator T *() { return addr; }

  // Return an Iter to the start of the allocated memory.
  Iter<T> begin() {
    int size;

    if(isArray) size = arraySize;
    else size = 1;

    return Iter<T>(addr, addr, addr + size);
  }

  // Return an Iter to one past the end of an allocated array.
  Iter<T> end() {
    int size;

    if(isArray) size = arraySize;
    else size = 1;

    return Iter<T>(addr + size, addr, addr + size);
  }

  // Return the size of gclist for this type
  // of GCPtr.
  static int gclistSize() {
    if(WaitForSingleObject(hMutex, 10000)==WAIT_TIMEOUT)
      throw TimeOutExc();

    unsigned sz = gclist.size();

    ReleaseMutex(hMutex);
    return sz;
  }

  // A utility function that displays gclist.
  static void showlist();

  // The following functions support multithreading.
  //

  // Returns true if the collector is still in use.
  static bool isRunning() { return instCount > 0; }

  // Clear gclist when program exits.
  static void shutdown();

  // Entry point for garbage collector thread.
  static unsigned __stdcall gc(void *param);
};

// Create storage for the static variables.
template <class T, int size>
  list<GCInfo<T> > GCPtr<T, size>::gclist;

template <class T, int size>
  int GCPtr<T, size>::instCount = 0;

template <class T, int size>
  HANDLE GCPtr<T, size>::hMutex = 0;

template <class T, int size>
  HANDLE GCPtr<T, size>::hThrd = 0;

// Destructor for GCPtr.
template <class T, int size>
GCPtr<T, size>::~GCPtr() {
  if(WaitForSingleObject(hMutex, 10000)==WAIT_TIMEOUT)
    throw TimeOutExc();

  typename list<GCInfo<T> >::iterator p;
  p = findPtrInfo(addr);

  if(p->refcount) p->refcount--; // decrement ref count

  // Decrement instance counter for each object
  // that is destroyed.
  instCount--;

  ReleaseMutex(hMutex);
}

// Collect garbage. Returns true if at least
// one object was freed.
template <class T, int size>
bool GCPtr<T, size>::collect() {

  if(WaitForSingleObject(hMutex, 10000)==WAIT_TIMEOUT)
    throw TimeOutExc();

  bool memfreed = false;

  typename list<GCInfo<T> >::iterator p;
  do {

    // Scan gclist looking for unreferenced pointers.
    for(p = gclist.begin(); p != gclist.end(); p++) {
      // If in-use, skip.
      if(p->refcount > 0) continue;

      memfreed = true;

      // Save what is needed to free the memory, because
      // p cannot be dereferenced once its entry is removed.
      T *mem = p->memPtr;
      bool wasArray = p->isArray;

      // Remove unused entry from gclist. erase() returns
      // a valid iterator to the next entry (or end()).
      p = gclist.erase(p);

      // Free memory unless the GCPtr is null.
      if(mem) {
        if(wasArray) {
          delete[] mem; // delete array
        }
        else {
          delete mem; // delete single element
        }
      }

      // Restart the search.
      break;
    }

  } while(p != gclist.end());

  ReleaseMutex(hMutex);

  return memfreed;
}

// Overload assignment of pointer to GCPtr.
template <class T, int size>
T *GCPtr<T, size>::operator=(T *t) {

  if(WaitForSingleObject(hMutex, 10000)==WAIT_TIMEOUT)
    throw TimeOutExc();

  typename list<GCInfo<T> >::iterator p;

  // First, decrement the reference count
  // for the memory currently being pointed to.
  p = findPtrInfo(addr);
  p->refcount--;

  // Next, if the new address is already
  // existent in the system, increment its
  // count. Otherwise, create a new entry
  // for gclist.
  p = findPtrInfo(t);
  if(p != gclist.end())
    p->refcount++;
  else {
    // Create and store this entry.
    GCInfo<T> gcObj(t, size);
    gclist.push_front(gcObj);
  }

  addr = t; // store the address.

  ReleaseMutex(hMutex);

  return t;
}

// Overload assignment of GCPtr to GCPtr.
template <class T, int size>
GCPtr<T, size> & GCPtr<T, size>::operator=(GCPtr &rv) {

  if(WaitForSingleObject(hMutex, 10000)==WAIT_TIMEOUT)
    throw TimeOutExc();

  typename list<GCInfo<T> >::iterator p;

  // First, decrement the reference count
  // for the memory currently being pointed to.
  p = findPtrInfo(addr);
  p->refcount--;

  // Next, increment the reference count
  // of the new object.
  p = findPtrInfo(rv.addr);
  p->refcount++; // increment ref count

  addr = rv.addr; // store the address.

  ReleaseMutex(hMutex);

  return rv;
}

// A utility function that displays gclist.
template <class T, int size>
void GCPtr<T, size>::showlist() {

  if(WaitForSingleObject(hMutex, 10000)==WAIT_TIMEOUT)
    throw TimeOutExc();

  typename list<GCInfo<T> >::iterator p;

  cout << "gclist<" << typeid(T).name() << ", "
       << size << ">:\n";
  cout << "memPtr refcount value\n";

  if(gclist.begin() == gclist.end()) {
    cout << " -- Empty --\n\n";
    ReleaseMutex(hMutex); // balance the wait before returning early
    return;
  }

  for(p = gclist.begin(); p != gclist.end(); p++) {
    cout << "[" << (void *)p->memPtr << "]"
         << " " << p->refcount << " ";
    if(p->memPtr) cout << " " << *p->memPtr;
    else cout << " ";
    cout << endl;
  }
  cout << endl;

  ReleaseMutex(hMutex);
}

// Find a pointer in gclist.
template <class T, int size>
typename list<GCInfo<T> >::iterator
  GCPtr<T, size>::findPtrInfo(T *ptr) {

  typename list<GCInfo<T> >::iterator p;

  // Find ptr in gclist.
  for(p = gclist.begin(); p != gclist.end(); p++)
    if(p->memPtr == ptr)
      return p;

  return p;
}

// Entry point for garbage collector thread.
template <class T, int size>
unsigned __stdcall GCPtr<T, size>::gc(void *param) {
  #ifdef DISPLAY
    cout << "Garbage collection started.\n";
  #endif

  while(isRunning()) {
    collect();
  }

  collect(); // collect garbage on way out

  // Release and reset the thread handle so
  // that the garbage collection thread can
  // be restarted if necessary.
  CloseHandle(hThrd);
  hThrd = 0;

  #ifdef DISPLAY
    cout << "Garbage collection terminated for "
         << typeid(T).name() << "\n";
  #endif

  return 0;
}

// Clear gclist when program exits.
template <class T, int size>
void GCPtr<T, size>::shutdown() {

  if(gclistSize() == 0) return; // list is empty

  typename list<GCInfo<T> >::iterator p;

  #ifdef DISPLAY
    cout << "Before collecting for shutdown() for "
         << typeid(T).name() << "\n";
  #endif

  for(p = gclist.begin(); p != gclist.end(); p++) {
    // Set all remaining reference counts to zero.
    p->refcount = 0;
  }

  collect();

  #ifdef DISPLAY
    cout << "After collecting for shutdown() for "
         << typeid(T).name() << "\n";
  #endif
}


To use the multithreaded garbage collector, include gcthrd.h in your program. Then, use GCPtr 
in the same way as described in Chapter 2. When you compile the program, you must 
remember to link in the multithreaded libraries, as explained earlier in this chapter in the 
section describing _beginthreadex( ) and _endthreadex( ).

To see the effects of the multithreaded garbage collector, try this version of the load test 
program originally shown in Chapter 2: 

// Demonstrate the multithreaded garbage collector.
#include <iostream>
#include <new>
#include "gcthrd.h"

using namespace std;

// A simple class for load testing GCPtr.
class LoadTest {
  int a, b;
public:
  double n[100000]; // just to take-up memory
  double val;

  LoadTest() { a = b = 0; }

  LoadTest(int x, int y) {
    a = x;
    b = y;
    val = 0.0;
  }

  friend ostream &operator<<(ostream &strm, LoadTest &obj);
};

// Create an inserter for LoadTest.
ostream &operator<<(ostream &strm, LoadTest &obj) {
  strm << "(" << obj.a << " " << obj.b << ")";
  return strm;
}

int main() {
  GCPtr<LoadTest> mp;
  int i;

  for(i = 1; i < 2000; i++) {
    try {
      mp = new LoadTest(i, i);
      if(!(i%100))
        cout << "gclist contains " << mp.gclistSize()
             << " entries.\n";
    } catch(bad_alloc xa) {
      // For most users, this exception won't
      // ever occur.
      cout << "Last object: " << *mp << endl;
      cout << "Length of gclist: "
           << mp.gclistSize() << endl;
    }
  }

  return 0;
}


Here is a sample run. (Of course, your output may vary.) This output was produced with the 
display option turned on by defining DISPLAY within gcthrd.h. 
for class LoadTest
for class LoadTest 


As you can see, because collect( ) is running in the background, gclist never gets very large, 
even though thousands of objects are being allocated and abandoned. 



Some Things to Try 


Creating successful multithreaded programs can be quite challenging. One reason for this is 
the fact that multithreading requires that you think of programs in parallel rather than linear 
terms. Furthermore, at runtime, threads interact in ways that are often difficult to anticipate. 
Thus, you might be surprised (or even bewildered) by the actions of a multithreaded program. 
The best way to get good at multithreading is to play with it. Toward this end, here are some 
ideas that you might want to try. 

Try adding another list box to the thread control panel that lets the user adjust the priority 
class of the thread in addition to its priority value. Try adding various synchronization objects 
to the control panel that can be turned on or off under user control. This will let you 
experiment with different synchronization options. 

For the multithreaded garbage collector, try collecting garbage less often, such as when gclist 
reaches a certain size or after free memory drops to a predetermined point. Alternatively, you 
could use a waitable timer to activate garbage collection on a regular basis. Finally, you might 
want to experiment with the garbage collector's priority class and settings to find which level 
is optimal for your use. 
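As a starting point for the first two experiments, the following portable sketch shows one way a collector could sleep until either the list reaches a threshold or a time interval passes. It is an illustration only, under stated assumptions: std::condition_variable with a time-out stands in for a Windows waitable timer, pending stands in for gclist, clearing the list is a placeholder for freeing unreferenced entries, and the names threshold, addEntry, and collectWhenNeeded are hypothetical.

```cpp
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <list>
#include <mutex>

std::mutex mtx;
std::condition_variable wake;
std::list<int> pending;            // stands in for gclist
const std::size_t threshold = 4;   // hypothetical trigger size

// Called by allocating code: record an entry and wake the collector
// once the list has grown to the threshold.
void addEntry(int v) {
  std::lock_guard<std::mutex> lock(mtx);
  pending.push_front(v);
  if(pending.size() >= threshold) wake.notify_one();
}

// One collector step: sleep until the threshold is reached or the
// interval elapses, then clear the list. Returns the number removed.
std::size_t collectWhenNeeded(std::chrono::milliseconds interval) {
  std::unique_lock<std::mutex> lock(mtx);
  wake.wait_for(lock, interval,
                [] { return pending.size() >= threshold; });
  std::size_t n = pending.size();
  pending.clear();                 // placeholder for freeing entries
  return n;
}
```

While it waits, the collector thread consumes no CPU time, so this design addresses the busy loop in gc( ) and the question of how often to collect at the same time.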


What is DirectX? 

DirectX consists of application programming interfaces (APIs) that are grouped into two classes: the
DirectX Foundation layer and the DirectX Media layer. These APIs enable programs to directly access many
of your computer's hardware devices.

The DirectX Foundation layer automatically determines the hardware capabilities of your computer and then
sets your programs' parameters to match. This allows multimedia applications to run on any Windows-based
computer and at the same time ensures that the multimedia applications take full advantage of
high-performance hardware.

The DirectX Foundation layer contains a single set of APIs that provide improved access to the advanced 
features of high-performance hardware, such as 3-D graphics acceleration chips and sound cards. These 
APIs control low-level functions, including 2-D graphics acceleration; support for input devices such as 
joysticks, keyboards, and mice; and control of sound mixing and sound output. The low-level functions are 
supported by the components that make up the DirectX Foundation layer: 


Microsoft DirectDraw 

The Microsoft DirectDraw API supports extremely fast, direct access to the accelerated hardware capabilities
of a computer's video adapter. It supports standard methods of displaying graphics on all video adapters,
and faster, more direct access when using accelerated drivers. DirectDraw provides a device-independent
way for programs, such as games and two-dimensional (2-D) graphics packages, and Windows system
components, such as digital video codecs, to gain access to the features of specific display devices without
requiring any additional information from the user about the device's capabilities.

Microsoft Direct3D Immediate Mode 

The Microsoft Direct3D Immediate Mode API (Direct3D) provides an interface to the 3-D rendering functions 
built into most new video cards. Direct3D is a low-level 3-D API that provides a device-independent way for 
applications to communicate with accelerator hardware efficiently and powerfully. 

Direct3D provides application developers with many advanced features, such as: 

• Switchable depth buffering (using z-buffers or w-buffers) 

• Flat and Gouraud shading 

• Multiple lights and light types 

• Full material and texture support 

• Robust software emulation drivers 

• Transformation and clipping 

• Hardware independence

• Full hardware acceleration on Windows 2000 (when the appropriate device drivers are available) 

• Built-in support for specialized CPU instruction sets, including Intel's MMX and Pentium III
architectures, and the 3DNow! architecture

Microsoft DirectSound 

The Microsoft DirectSound API provides a link between programs and an audio adapter's sound mixing and
playback capabilities. It also enables wave sound capture and playback. DirectSound provides multimedia 
applications with low-latency mixing, hardware acceleration, and direct access to the sound device. It 
provides this feature while maintaining compatibility with existing device drivers. 

Microsoft DirectMusic 

The Microsoft DirectMusic API is the musical component of DirectX. Unlike the DirectSound API, which 
captures and plays digital sound samples, DirectMusic works with message-based musical data that is 
converted to digital audio either by your sound card or by its built-in software synthesizer. As well as 
supporting input in Musical Instrument Digital Interface (MIDI) format, DirectMusic provides application 
developers the ability to create immersive, dynamic soundtracks that respond to user input. 

Microsoft DirectInput

The Microsoft DirectInput API provides advanced input for games and processes input from joysticks as well
as other related devices including the mouse, keyboard, and other game controllers, such as force-feedback
game controllers.

The DirectX Media layer works with the DirectX Foundation layer to provide high-level services that support 
animation, media streaming (transmission and viewing of audio and video as it is downloaded over the 
Internet), and interactivity. Like the DirectX Foundation layer, the DirectX Media layer is comprised of 
several integrated components that include: 


Microsoft Direct3D Retained Mode 



The Microsoft Direct3D Retained Mode API provides higher-level support for advanced, real-time,
three-dimensional (3-D) graphics. Direct3D Retained Mode provides built-in support for graphics techniques like
hierarchies and animation. Direct3D Retained Mode is built on top of Direct3D Immediate Mode.


Microsoft DirectAnimation 

The Microsoft DirectAnimation API provides integration and animation for different types of media, such as 
two-dimensional images, three-dimensional objects, sounds, movies, text, and vector graphics. 

Microsoft DirectPlay 

The Microsoft DirectPlay API supports game connections over a modem, the Internet, or LAN. DirectPlay 
simplifies access to communication services and provides a way for games to communicate with each other, 
independent of the underlying protocol, or online service. 

Microsoft DirectShow 

The Microsoft DirectShow API plays multimedia files located in local files or on Internet servers, and 
captures multimedia streams from devices, such as video capture cards. DirectShow plays video and audio 
content compressed in various formats, including MPEG, audio-video interleaved (AVI), and WAV. 

Microsoft DirectX Transform 

The Microsoft DirectX Transform API enables application developers to create, animate, and edit digital 
images. DirectX Transform works with both two-dimensional (2-D) images and three-dimensional (3-D) 
images, which can be used to create stand-alone programs or dynamic plug-ins for Web graphics. 

CH 1: What is DirectX and its Components

What is DirectX? 

Before the release of Windows 95, most games were released for the DOS platform, usually 
using something like DOS4GW or some other 32-bit DOS extender to obtain access to 32-bit 
protected mode. Windows 95, however, seemed to signal the beginning of the end of the DOS 
prompt. Games developers began to wonder how they were going to write games optimally 
that would run under Windows 95 - games typically need to run in full-screen mode, and need 
to get as close as possible to your hardware. Windows 95 seemed to be "getting in the way" of 
this. DOS had allowed them to program as "close to the metal" as possible, that is, get 
straight to the hardware, without going through layers of abstraction and encapsulation. In 
those days, the extra overhead of a generic API would have made games too slow. 

So Microsoft's answer to this problem was a Software Development Kit (SDK) called DirectX. 
DirectX is a horrible, clunky, poorly-designed, poorly-documented, bloated, ugly, confusing 
beast (*) of an API (Application Programming Interface) that has driven many a programmer to
drink. It was originally purchased from a London company called RenderMorphics, and quietly 
released more or less as is as DirectX 2. DirectX 3 was probably the first "serious" release by 
Microsoft, who had now begun to actively push it as the games programming API of the 
future. Being the biggest software company on the planet, and being the developers of the 
Operating System that some 90% of desktop users were using, they succeeded. Hardware 
vendors quickly realised that following the Microsoft lead was the prudent thing to do, and 
everyone began to produce DirectX drivers for their hardware. In many ways this was a good 
thing for game developers. 



A lot of improvements have been made to the original DirectX. For example, the 
documentation doesn't suck as much as it originally did. Some of the poorly designed sections 
of the original API have been cleaned up and improved. Some of the really poorly designed
sections of the original API have been removed. 

One of the main purposes of DirectX is to provide a standard way of accessing many different 
proprietary hardware devices. For example, Direct3D provides a "standard" programming 
interface that can be used to access the 3D hardware acceleration features of almost all 3D 
cards on the market which have Direct3D drivers written for them. In theory this is supposed 
to make it possible for one application to transparently run as it is supposed to across a wide 
variety of different hardware configurations. In practice, it usually isn't this simple. 

One of the reasons it isn't that simple is that hardware (such as a 3D graphics accelerator)
normally supports only a subset of the features available in DirectX, and you don't really want
to use a feature if it isn't available in some sort of hardware accelerated form. To find out 
which features are available you have to query a device for a list of capabilities, and there can 
be many of these. 

The DirectX API is designed primarily for writing games, but can be used in other types of 
applications as well. The API at the moment has five main sections: 

DirectX Components 

DirectDraw     2-dimensional graphics capabilities, surfaces, double buffering, etc. 
Direct3D       A fairly full-featured 3D graphics programming API. 
DirectSound    Sound; 3D sound 
DirectPlay     Simplifies network game development 
DirectInput    Handles input from various peripherals 

Additionally, DirectX 6 introduces something called DirectMusic, which is supposed to make it 
easier for game developers to include music in their games so that the mood of the music 
changes depending on what type of action is going on in the game. 

DirectX performance and hardware acceleration 

Although the performance of Direct3D in software only is not too shabby, it doesn't quite cut it 
for serious games. DirectX is designed with hardware acceleration in mind. It tries to provide 
the lowest possible level access to hardware, while still remaining a generic interface. Allowing 
functions such as 3D triangle drawing to be performed on the graphics card frees the CPU 
(Central Processing Unit) to do other things. Typical Direct3D hardware accelerators 
also have at least 4, and preferably 16 or more, megabytes of onboard RAM to store texture 
maps (bitmapped images made up of small dots called "pixels"), sprites, overlays 
and more. 

DirectDraw and Direct3D are built as a relatively thin layer above the hardware, using what is 
called the DirectDraw "hardware abstraction layer" (HAL). For functionality not provided by a 
certain card, an equivalent software implementation would be provided through the "hardware 
emulation layer" (HEL). 




Diagram illustrating where the DirectDraw/Direct3D architecture fits in, hopefully reasonably accurately. 


DirectX and COM 

The set of DirectX modules are built as COM (Component Object Model) objects. COM is yet 
another ugly broken interface from Microsoft - although newer versions of COM don't suck as 
much as the earlier incarnations. Don't get me wrong, I'm not against the existence of 
something that does what COM does - but the implementation leaves much to be desired. 
Anyway, a COM object is a bit like a C++ class, in that it encapsulates a set of methods and 
attributes in a single module, and in that it provides a kludgy sort of inheritance model, 
whereby one COM object can be built to support all the methods of its parent object, and then 
add some more. 

You don't need to know much about COM to use DirectX, so don't worry too much about it. 

You do a little bit of COM stuff when initializing objects and cleaning them up, and when 
checking return values of function calls, but that's more or less it. 

CH 2 - Palettes, Gaming concepts, double buffering 

Video Modes 

Screen modes come in several flavours, based on how many bits are used to store the color of 
each pixel on the screen. Naturally, the more bits you use per pixel, the more colours you can 
display at once; but there is more data to move into graphics memory to update the screen. 

• 1,2,4 and 8 bit "indexed" modes (8 bit is the most popular and is better known as 
"256-color mode"). 

• 16-bit (64K colors) "high-color" modes 

• 24-bit (16.7M colors) "true-color" modes 



• 32-bit RGBA modes. The first 3 bytes are used the same as in 24-bit modes; the A 
byte is for an "alpha-channel", which provides information about the opacity 
(transparency) of the pixel. 

These modes are available, typically, in the following resolutions: 

• 320x200 

• 320x240 

• 640x400 

• 640x480 

• 800x600 

• 1024x768 

• 1280x1024 

• 1600x1200 (drool) 

with 640x480 being probably the most common mode for running games in at the moment. 
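To put numbers on the "more data to move" trade-off, here is a small sketch that works out how many bytes one full frame takes in a given mode. The function name frame_bytes is made up for illustration, and it assumes tightly packed rows with no padding and a bit depth that is a whole number of bytes.

```cpp
#include <cassert>
#include <cstddef>

// Bytes needed for one full frame, assuming tightly packed rows
// (no padding) and a bit depth that is a multiple of 8.
// frame_bytes is an illustrative name, not a DirectX function.
std::size_t frame_bytes(std::size_t width, std::size_t height, std::size_t bitsPerPixel)
{
    assert(bitsPerPixel % 8 == 0);
    return width * height * (bitsPerPixel / 8);
}
```

For 640x480 this gives 307,200 bytes in 8-bit mode and 1,228,800 bytes in 32-bit mode, so the 32-bit mode has four times as much data to push around each frame.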

Monitors generally have a width that is 4/3 times their height (called the aspect ratio); so 
with modes where the number of pixels along the width is 4/3 times the number of pixels 
along the height, the pixels will have an aspect ratio of 1, and thus be physically square. That 
is to say, 100 pixels in one direction should then be the same physical length as 100 pixels in 
a perpendicular direction. Note that 320x200 does not have this property; so in 320x200 
pixels are actually stretched to be taller than they are wide. 
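The arithmetic behind this can be sketched as follows. The function name pixel_aspect is hypothetical, and the sketch assumes the mode fills the whole visible area of a 4:3 monitor.

```cpp
#include <cassert>

// Width-to-height ratio of a single pixel when a mode fills a monitor
// whose physical width is monitorAspect times its height.
// 1.0 means square pixels; less than 1.0 means taller than wide.
// pixel_aspect is an illustrative name.
double pixel_aspect(int width, int height, double monitorAspect = 4.0 / 3.0)
{
    assert(width > 0 && height > 0);
    return monitorAspect * height / width;
}
```

For 640x480 the result is 1 (square pixels), while for 320x200 it is about 0.83, which is why pixels in that mode look taller than they are wide.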

Color theory 

There are a number of different ways that colors can be represented, known as "color 
models". The most common one is probably RGB (Red, Green, Blue). Nearly all possible visible 
colors can be produced by combining, in various proportions, the three primary colors red, 
green and blue. These are commonly stored as three bytes - each byte represents the relative 
intensity of each primary color as a value from 0 to 255 inclusive. Pure bright red, for 
example, would be RGB(255,0,0). Purple would be RGB(255, 0,255), grey would be 
RGB(150,150,150), and so on. 

Here is an example of some C code that you might use for representing RGB colors. 
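A minimal sketch of such a representation, assuming one byte per channel. The type name RgbColor and the helper make_color are illustrative names only; the block is written so that it compiles as C++ like the rest of the code in this tutorial.

```cpp
#include <cassert>

// One byte per channel, each ranging from 0 to 255.
// The type and field names are illustrative.
struct RgbColor
{
    unsigned char r;
    unsigned char g;
    unsigned char b;
};

// Build a color from integer intensities in the range 0..255.
RgbColor make_color(int r, int g, int b)
{
    assert(r >= 0 && r <= 255 && g >= 0 && g <= 255 && b >= 0 && b <= 255);
    RgbColor c;
    c.r = static_cast<unsigned char>(r);
    c.g = static_cast<unsigned char>(g);
    c.b = static_cast<unsigned char>(b);
    return c;
}
```

With this, pure bright red is make_color(255, 0, 0).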



Alternatively you may want to store an RGB color in an unsigned 32-bit integer. Bits 0 to 7 are 
used to store the blue value, bits 8 to 15 for the green and so on. 


typedef unsigned int rgb_color; 


#define MAKE_RGB(r, g, b) ( ((r) << 16) | ((g) << 8) | (b) ) 


Anyway, I'm rambling now. 

There are other color models, such as HSV (Hue, Saturation, Value), but I won't be going 
into them here. The book "Computer Graphics: Principles and Practice" by Foley & van Dam 
(often referred to as The Computer Graphics Bible) explains color models in some detail, and 
how to convert between them. 

High-color and true-color modes 

In high-color and true-color modes, the pixels on the screen are stored in video memory as 
their corresponding RGB make-up values. For example, if the top left pixel on the screen was 
green, then (in true-color mode) the first three bytes in video memory would be 0, 255 and 0. 

In high-color modes the RGB values are specified using (if I remember correctly) 5, 6 and 5 
bits for red, green and blue respectively, so in the above example the first 16-bit value in video 
memory would be, in binary: 00000111 11100000. 
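As a rough sketch of the 5-6-5 layout just described, the following packs three 8-bit intensities into one 16-bit value. The name pack_rgb565 is made up, and the layout is an assumption: some cards use 5-5-5 instead, so real code should take the channel masks from the surface's pixel format rather than hard-coding them.

```cpp
#include <cassert>
#include <cstdint>

// Pack 8-bit-per-channel RGB into a 16-bit 5-6-5 value:
// red in bits 11..15, green in bits 5..10, blue in bits 0..4.
// The low bits of each channel are simply discarded.
// pack_rgb565 is an illustrative name and assumes a 5-6-5 layout.
std::uint16_t pack_rgb565(unsigned r, unsigned g, unsigned b)
{
    assert(r <= 255 && g <= 255 && b <= 255);
    return static_cast<std::uint16_t>(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}
```

Pure green, pack_rgb565(0, 255, 0), comes out as 0x07E0, which is the bit pattern 00000111 11100000.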

Palette-based, or "indexed" modes 

Indexed color modes use the notion of a color "look up table" (LUT). The most common of 
these modes is 8-bit, better known as 256 color mode. Each pixel on the screen is represented 
by a single byte, which means that up to 2^8 (256) colors can be displayed on the screen at once. The colors 
assigned to each of these 256 indexes are stored as 3 byte RGB values in the LUT, and these 
colors are used by the graphics hardware to determine what color to display on the screen. 

Creating an application using indexed modes can be a pain, especially for the graphics artist, 
but there are sometimes advantages to using indexed modes: 

• Less memory is required to store the information in bitmaps and on the screen. 

• Because less memory is required, drawing routines can be made faster, since there 
are fewer bytes to transfer. 

• Some interesting "palette animation" tricks, that would be quite difficult to do in a 
normal mode, can be done quite easily in indexed modes. By changing the values in 
the LUT, you can change the colors on the screen without modifying screen memory at 
all. For example, a fade-out can be done by fading the RGB values in the LUT to zero. 

• Some 3D accelerators support indexed modes for textures, which can be useful if (for 
example) you have a very large texture that takes up a lot of memory. 
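The look-up table idea and the palette fade-out trick can be sketched in plain C++ as below. Everything here (PaletteEntry, IndexedScreen, displayed_color, fade_palette) is a hypothetical stand-in for what the graphics hardware does, not DirectDraw's palette API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One LUT entry: a 3-byte RGB value. All names here are illustrative.
struct PaletteEntry
{
    std::uint8_t r, g, b;
};

// A 256-entry color look-up table plus a framebuffer of 8-bit indexes.
struct IndexedScreen
{
    PaletteEntry lut[256];
    std::vector<std::uint8_t> pixels;  // one palette index per pixel
};

// The color the hardware would show for pixel i: look the index up in the LUT.
PaletteEntry displayed_color(const IndexedScreen &screen, std::size_t i)
{
    assert(i < screen.pixels.size());
    return screen.lut[screen.pixels[i]];
}

// One step of a fade-out: scale every LUT entry by level/255.
// level = 255 restores the original palette, level = 0 gives black.
// Only the 256 LUT entries are written, never the pixel data.
void fade_palette(IndexedScreen &screen, const PaletteEntry *original, int level)
{
    assert(level >= 0 && level <= 255);
    for (int i = 0; i < 256; ++i)
    {
        screen.lut[i].r = static_cast<std::uint8_t>(original[i].r * level / 255);
        screen.lut[i].g = static_cast<std::uint8_t>(original[i].g * level / 255);
        screen.lut[i].b = static_cast<std::uint8_t>(original[i].b * level / 255);
    }
}
```

Calling fade_palette once per frame with a decreasing level fades the whole screen while touching only 768 bytes, however large the screen is.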


ModeX 

ModeX is a special type of VGA 256 color mode in which the contents of graphics memory (i.e. 
what appears on the screen) is stored in a somewhat complex planar format. The resolution of 
ModeX modes isn't very high. DirectDraw knows how to write to ModeX surfaces, but the 
Windows GDI doesn't, so be careful when trying to mix GDI and DirectDraw ModeX surfaces. 
When setting the DirectDraw fullscreen mode, it is possible to choose whether or not 
DirectDraw is allowed to create ModeX surfaces. These days you probably want to avoid 
ModeX. 


Pitch/Stride 


Even though the screen resolution might be, say, 640x480x32, this does not necessarily mean 
that each row of pixels will take up 640*4 bytes in memory. For speed reasons, graphics cards 
often store surfaces wider than their logical width (a trade-off of memory for speed.) For 
example, a graphics card that supports a maximum of 1024x768 might store all modes from 
320x200 up to 1024x768 as 1024x768 internally. This leaves a "margin" on the right side of a 
surface. This actual allocated width for a surface is known as the pitch or stride of the 
surface. It is important to know the pitch of any surface whose memory you are going to write 
into, whether it is a 2D DirectDraw surface or a texture map. The pitch of a surface can be 
queried using DirectDraw. 

Text diagram illustrating pitch: 


Display memory: 

+---------------------------+-------------+ 
|                           |             | 
|<----- screen width ------>|             | 
|                           |             | 
|<------------ pitch/stride ------------->| 
|                           |             | 
|                           |             | 
+---------------------------+-------------+ 
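In code, the practical consequence is that the start of row y is y * pitch bytes into the surface, not y * width * bytesPerPixel. Here is a minimal sketch for a 32-bit surface; the function names are illustrative, and the base pointer and pitch are assumed to come from wherever you locked the surface (in DirectDraw, the surface description filled in by Lock carries the pitch).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Byte offset of pixel (x, y) in a surface whose rows are pitch bytes apart.
// Illustrative helper, not part of DirectDraw.
std::size_t pixel_offset(std::size_t x, std::size_t y,
                         std::size_t pitch, std::size_t bytesPerPixel)
{
    return y * pitch + x * bytesPerPixel;
}

// Write one 32-bit pixel into surface memory.
// base  : pointer to the first byte of the surface memory
// pitch : bytes from the start of one row to the start of the next,
//         which may be larger than width * 4
void put_pixel32(std::uint8_t *base, std::size_t pitch,
                 std::size_t width, std::size_t height,
                 std::size_t x, std::size_t y, std::uint32_t color)
{
    assert(x < width && y < height);
    assert(pitch >= width * sizeof(color));
    std::memcpy(base + pixel_offset(x, y, pitch, sizeof(color)), &color, sizeof(color));
}
```

With a 640-pixel-wide surface stored at a pitch of 4096 bytes, row 1 starts at byte 4096 rather than byte 2560. Using width * 4 as the row step would scribble into the wrong rows.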


A few gaming concepts you'll need to know to write games 
Bitmaps and sprites 

A bitmap is an image on the computer that is stored as an array of pixel values. That's a 
pretty crappy description. Basically, a bitmap is any picture on the computer, normally a 
rectangular block of 'pixels'. A sprite is the same thing as a bitmap, except normally it refers 
to a bitmap that has transparent areas (exact definitions of sprite may vary from programmer 
to programmer.) Sprites are an extremely important component of games. They have a million 
and one uses. For example, your mouse cursor qualifies as a sprite. The monsters in DOOM 
are also sprites. They are flat images with transparent areas that are programmed to always 
face you. Note that the sprite always faces you - this doesn't mean the monster is facing you. 
Anyway, enough said about bitmaps and sprites, I think. 

Double buffering and page flipping 

If your game did all its drawing straight to the current display, the user would notice horribly 
flickery artefacts as the elements of the game got drawn onto the screen. The solution to this 
is to have two graphics buffers, a "front buffer" and a "back buffer". The front buffer is visible 
to the user, the back buffer is not. You do all your drawing to the back buffer, and then when 
you have finished drawing everything on the screen, you copy (or flip) the contents of the 
back buffer into the front buffer. This is known as double buffering, and some sort of double 
buffering scheme is used in virtually every game. 

There are generally two ways to perform the transfer of the back buffer to the front buffer: 
copying or page-flipping. 

• Copying: The contents of the back buffer are simply copied over into the front buffer. 
The back buffer can be in system memory or be another video memory surface. 


• Page-flipping: With this technique, no actual copying is done. Both buffers must exist 
in video memory. For each frame of your game you alternate which of these two 
surfaces you draw to. You always draw to the currently invisible one, and at the end of 
rendering the frame, you instruct the graphics hardware to use that frame as the 
visible one. Thus the front buffer becomes the back buffer (and vice versa) each 
frame. 
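The difference between the two techniques can be sketched with two plain memory buffers. This is only a model of the idea, with made-up names; real page-flipping is done by the graphics hardware, not by swapping an index in your program.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Two equally sized buffers: the "display" shows front(), drawing goes to back().
// DoubleBuffer and its members are illustrative names.
struct DoubleBuffer
{
    std::vector<std::uint8_t> buffers[2];
    int frontIndex;  // which of the two buffers is currently visible

    explicit DoubleBuffer(std::size_t size) : frontIndex(0)
    {
        assert(size > 0);
        buffers[0].assign(size, 0);
        buffers[1].assign(size, 0);
    }

    std::vector<std::uint8_t> &front() { return buffers[frontIndex]; }
    std::vector<std::uint8_t> &back()  { return buffers[1 - frontIndex]; }

    // Copying: the back buffer's contents are copied into the front buffer.
    void present_by_copy() { front() = back(); }

    // Page-flipping: nothing is copied, the two buffers just swap roles.
    void present_by_flip() { frontIndex = 1 - frontIndex; }
};
```

Note that after a flip the new back buffer still holds the frame from before, so a game normally redraws everything (or at least clears it) each frame.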

A problem that can arise from this technique is "tearing". Your monitor redraws the image on 
the screen fairly frequently, normally at around 70 times per second (or 70 Hertz). It normally 
draws from top to bottom. Now, it can happen that the screen has only drawn half of its 
image, when you decide to instruct it to start drawing something else, using any one of the 
two techniques described above. When you do this, the bottom half of the screen is drawn 
using the new image, while the top half still has the old image. The visual effect this produces 
is called tearing, or shearing. A solution exists, however. It is possible to time your page 
flipping to coincide with the end of a screen refresh. I'll stop here though, having let you 
know that it is possible. (As far as I know, DirectDraw's Flip waits for the vertical refresh by 
default, so in page-flipping mode this is handled for you.) 

Clipping and DirectDraw clippers 

Clipping is the name given to the technique of preventing drawing routines from drawing off 
the edge of the screen or other rectangular bounding area such as a window. If not 
performed, the general result could best be described as a mess. In DirectDraw, for example, 
when using windowed mode, Windows basically gives DirectDraw the right to draw anywhere 
on the screen that it wants to. However, a well-behaved DirectDraw application would 
normally only draw into its own window. DirectX has an object called a "clipper" that can be 
attached to a DirectDraw surface to prevent it drawing outside of the window. 
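The basic rectangle arithmetic behind clipping can be sketched like this. It is not how a DirectDraw clipper is implemented (a clipper works with the window's visible region); Rect and clip_rect are illustrative names, and Rect is not the Win32 RECT type.

```cpp
#include <algorithm>
#include <cassert>

// Illustrative rectangle type; right and bottom are exclusive.
struct Rect
{
    int left, top, right, bottom;
};

// Clip r against bounds. Returns false if no part of r is visible;
// otherwise returns true and stores the visible part in out.
bool clip_rect(const Rect &r, const Rect &bounds, Rect &out)
{
    assert(bounds.left <= bounds.right && bounds.top <= bounds.bottom);
    out.left   = std::max(r.left,   bounds.left);
    out.top    = std::max(r.top,    bounds.top);
    out.right  = std::min(r.right,  bounds.right);
    out.bottom = std::min(r.bottom, bounds.bottom);
    return out.left < out.right && out.top < out.bottom;
}
```

A drawing routine would call something like this first and then only touch the pixels inside the clipped rectangle, skipping the draw entirely when the function returns false.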

DirectDraw surfaces 

DirectDraw uses "surfaces" to access any section of memory, either video memory or system 
memory, that is used to store (normally) bitmaps, texture maps, sprites, and the current 
contents of the screen or a window. 

DirectDraw also provides support for "overlays"; a special type of sprite. An overlay is 
normally a surface containing a bitmap with transparent sections that will be "overlaid" on the 
entire screen. For example, a racing car game might use an overlay for the image of the 
cockpit controls and window frame. 

The memory a DirectDraw surface uses can be lost in some circumstances, because 
DirectDraw has to share resources with the GDI. It is necessary for your application to check 
regularly that this hasn't happened, and to restore the surfaces if it has. 

DirectX return values and error-checking 

All DirectX functions return an HRESULT as an error-code. Since DirectX objects are based on 
the COM architecture, the correct way to check if a DirectX function has failed is to use the 
macros SUCCEEDED() and FAILED(), with the HRESULT as the parameter. It is not merely 
sufficient to check if, for example, your DirectDraw HRESULT is equal to DD_OK, since it is 
possible for COM objects to have multiple return values as success values. Your code will 
probably still work, but technically it is the wrong thing to do. 
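To see why comparing against a single success value is not enough, here is a sketch using stand-ins for the Windows definitions so that it compiles without the Windows headers. The names HResultStandIn, kOk, kFalse, kFail, succeeded and failed are made up for this example; in real code use HRESULT, S_OK, S_FALSE, E_FAIL, SUCCEEDED() and FAILED() from the SDK.

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins so the sketch compiles without the Windows headers.
// They mirror the values of S_OK, S_FALSE and E_FAIL.
typedef std::int32_t HResultStandIn;
const HResultStandIn kOk    = 0;                                         // same value as S_OK
const HResultStandIn kFalse = 1;                                         // same value as S_FALSE
const HResultStandIn kFail  = static_cast<HResultStandIn>(0x80004005u);  // same value as E_FAIL

// Mirrors SUCCEEDED(): any non-negative HRESULT is a success code.
inline bool succeeded(HResultStandIn hr) { return hr >= 0; }

// Mirrors FAILED(): a set top (sign) bit marks a failure.
inline bool failed(HResultStandIn hr) { return hr < 0; }

// The check the text warns against: treats every code except one as failure.
inline bool naive_is_ok(HResultStandIn hr) { return hr == kOk; }

// A few cases showing where the two checks disagree.
void hresult_examples()
{
    assert(succeeded(kOk) && naive_is_ok(kOk));
    assert(succeeded(kFalse) && !naive_is_ok(kFalse));  // a success the naive check misses
    assert(failed(kFail) && !naive_is_ok(kFail));
}
```

The middle case is the one that matters: a second success code passes the sign-bit test but fails the equality test.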

Something to be on the lookout for is that some DirectX functions return failure codes when 
they succeed. For example, IDirectPlay::GetPlayerData will "fail" with 
DPERR_BUFFERTOOSMALL when you are merely asking for the data size. This behaviour 
isn't documented either, which is incredibly frustrating. There aren't many of these, but be on 
the lookout. 

DirectX debugging 

When you install the DirectX SDK you get a choice of whether to install the retail version of 
the libraries, or the debug version. The debug version will actually write diagnostic 
OutputDebugString messages to your debugger. This can be very useful. However, it slows 
things down a LOT - if you have anything less than a Pentium 166, rather choose the retail 
libraries. Also, if you want to mainly play DirectX games, install the retail version. If you want 
to do mainly DirectX development, and your computer is quite fast, install the debug version. 
If you want to do both, then you should probably use the retail libraries, unless you have a 
very fast computer that can handle it. I normally install the retail version, but the debug 
versions can probably be quite useful for people starting out. 

CH 3: A simple DirectDraw sample 

This is a very simple DirectDraw sample. The source code for this sample is included Here. 

Setting up DirectX under Visual C/C++ 

I most likely won't be doing DirectX development under Watcom or Borland C/C++ or Delphi 
or VisualBasic etc; so if you want such info included here, you'll have to send it to me. 

Firstly, the directories must be set up so that Visual C/C++ can find the DirectX include files 
and libraries: 

1. Access the Tools/Options/Directories tabbed dialog. 

2. Select "library directories" from the drop-down list, and add the directory of the DX 
SDK libraries, e.g. "d:\dxsdk\sdk\lib" 

3. Select "include directories" from the drop-down list, and add the directory of the DX 
SDK header files, e.g. "d:\dxsdk\sdk\inc". 

4. If you are going to be using some of the DX utility headers used in the samples, then 
also add the samples\misc directory, e.g. "d:\dxsdk\sdk\samples\misc" to your 
includes path. 

Note that the version of DirectX that normally ships with Visual C++ isn't usually the latest, so 
to make sure that the compiler doesn't find the older version located in its own directories, 
add the include and library paths for the SDK in front of the default include and library paths. 

You must also, for each application that uses DirectX, explicitly add the required libraries to the 
project. Do this in Project/Settings (Alt+F7), under the "Link" tab for each configuration of 
your project. For DirectDraw, add ddraw.lib in the Object/Library modules box. You also 
need to add dxguid.lib here if your application uses any of the DirectX COM interface IDs, e.g. 
IID_IDirectDraw7. 

The DirectDraw sample 


Here is a screenshot of the application: 


The general outline of our sample DirectDraw application is as follows: 

1. Create a normal Windows window 

2. Set up our DirectX variables 

3. Initialize a DirectDraw object 

4. Set the "cooperative level" and display modes as necessary (explained later) 

5. Create front and back surfaces 

6. If in windowed mode, create and attach a clipper 

7. Render to the back buffer 

8. Perform the flipping. If in full-screen mode, just flip. If in windowed mode, you need to 
blit from the back surface to the primary surface each frame. 

9. Repeat from step 7 until we exit 

10. Clean up 

Setting up 

We are going to need a number of variables for our DirectDraw application. These can be 
global variables or class members, that's up to you. The same goes for functions. Here are the 
variables we're going to use: 


LPDIRECTDRAW        g_pDD;          // DirectDraw object 
LPDIRECTDRAWSURFACE g_pDDS;         // DirectDraw primary surface 
LPDIRECTDRAWSURFACE g_pDDSBack;     // DirectDraw back surface 
LPDIRECTDRAWCLIPPER g_pClipper;     // Clipper for windowed mode 
HWND                g_hWnd;         // Handle of window 
bool                g_bFullScreen;  // are we in fullscreen mode? 


All of these variables and functions I place in a separate file, which can be called anything you 
want, although you should not use file names that already exist, such as "ddraw.h". The 
compiler is likely to get confused about which one you want. I've used dd.h and dd.cpp in the 
sample. 

Remember to ensure that these variables are initialized to NULL before we begin. If you were 
creating classes, you could do this in the constructor of the class. 


Here is the general layout of my dd.h and dd.cpp files: 


dd.h 

#ifndef _DD_H_ 
#define _DD_H_ 

#include <ddraw.h> 

extern LPDIRECTDRAW g_pDD; 

extern void DirectXFunction(); 

#endif 

dd.cpp 

#include "stdafx.h" 
#include "dd.h" 
#include <ddraw.h> 

LPDIRECTDRAW g_pDD = NULL; 

void DirectXFunction() 
{ 
    g_pDD = NULL; 
} 


DirectDraw error checking 

Before we begin, we should define a "clean" way of checking and debugging error codes from 
DirectX functions. 

We create some functions to help us return and report error strings from HRESULT error 
codes. 

A function that returns a string with the name of an HRESULT code: 


const char *DDErrorString(HRESULT hr) 
{ 
    switch (hr) 
    { 
    case DDERR_ALREADYINITIALIZED:           return "DDERR_ALREADYINITIALIZED"; 
    case DDERR_CANNOTATTACHSURFACE:          return "DDERR_CANNOTATTACHSURFACE"; 
    case DDERR_CANNOTDETACHSURFACE:          return "DDERR_CANNOTDETACHSURFACE"; 
    case DDERR_CURRENTLYNOTAVAIL:            return "DDERR_CURRENTLYNOTAVAIL"; 
    case DDERR_EXCEPTION:                    return "DDERR_EXCEPTION"; 
    case DDERR_GENERIC:                      return "DDERR_GENERIC"; 
    case DDERR_HEIGHTALIGN:                  return "DDERR_HEIGHTALIGN"; 
    case DDERR_INCOMPATIBLEPRIMARY:          return "DDERR_INCOMPATIBLEPRIMARY"; 
    case DDERR_INVALIDCAPS:                  return "DDERR_INVALIDCAPS"; 
    case DDERR_INVALIDCLIPLIST:              return "DDERR_INVALIDCLIPLIST"; 
    case DDERR_INVALIDMODE:                  return "DDERR_INVALIDMODE"; 
    case DDERR_INVALIDOBJECT:                return "DDERR_INVALIDOBJECT"; 
    case DDERR_INVALIDPARAMS:                return "DDERR_INVALIDPARAMS"; 
    case DDERR_INVALIDPIXELFORMAT:           return "DDERR_INVALIDPIXELFORMAT"; 
    case DDERR_INVALIDRECT:                  return "DDERR_INVALIDRECT"; 
    case DDERR_LOCKEDSURFACES:               return "DDERR_LOCKEDSURFACES"; 
    case DDERR_NO3D:                         return "DDERR_NO3D"; 
    case DDERR_NOALPHAHW:                    return "DDERR_NOALPHAHW"; 
    case DDERR_NOCLIPLIST:                   return "DDERR_NOCLIPLIST"; 
    case DDERR_NOCOLORCONVHW:                return "DDERR_NOCOLORCONVHW"; 
    case DDERR_NOCOOPERATIVELEVELSET:        return "DDERR_NOCOOPERATIVELEVELSET"; 
    case DDERR_NOCOLORKEY:                   return "DDERR_NOCOLORKEY"; 
    case DDERR_NOCOLORKEYHW:                 return "DDERR_NOCOLORKEYHW"; 
    case DDERR_NODIRECTDRAWSUPPORT:          return "DDERR_NODIRECTDRAWSUPPORT"; 
    case DDERR_NOEXCLUSIVEMODE:              return "DDERR_NOEXCLUSIVEMODE"; 
    case DDERR_NOFLIPHW:                     return "DDERR_NOFLIPHW"; 
    case DDERR_NOGDI:                        return "DDERR_NOGDI"; 
    case DDERR_NOMIRRORHW:                   return "DDERR_NOMIRRORHW"; 
    case DDERR_NOTFOUND:                     return "DDERR_NOTFOUND"; 
    case DDERR_NOOVERLAYHW:                  return "DDERR_NOOVERLAYHW"; 
    case DDERR_NORASTEROPHW:                 return "DDERR_NORASTEROPHW"; 
    case DDERR_NOROTATIONHW:                 return "DDERR_NOROTATIONHW"; 
    case DDERR_NOSTRETCHHW:                  return "DDERR_NOSTRETCHHW"; 
    case DDERR_NOT4BITCOLOR:                 return "DDERR_NOT4BITCOLOR"; 
    case DDERR_NOT4BITCOLORINDEX:            return "DDERR_NOT4BITCOLORINDEX"; 
    case DDERR_NOT8BITCOLOR:                 return "DDERR_NOT8BITCOLOR"; 
    case DDERR_NOTEXTUREHW:                  return "DDERR_NOTEXTUREHW"; 
    case DDERR_NOVSYNCHW:                    return "DDERR_NOVSYNCHW"; 
    case DDERR_NOZBUFFERHW:                  return "DDERR_NOZBUFFERHW"; 
    case DDERR_NOZOVERLAYHW:                 return "DDERR_NOZOVERLAYHW"; 
    case DDERR_OUTOFCAPS:                    return "DDERR_OUTOFCAPS"; 
    case DDERR_OUTOFMEMORY:                  return "DDERR_OUTOFMEMORY"; 
    case DDERR_OUTOFVIDEOMEMORY:             return "DDERR_OUTOFVIDEOMEMORY"; 
    case DDERR_OVERLAYCANTCLIP:              return "DDERR_OVERLAYCANTCLIP"; 
    case DDERR_OVERLAYCOLORKEYONLYONEACTIVE: return "DDERR_OVERLAYCOLORKEYONLYONEACTIVE"; 
    case DDERR_PALETTEBUSY:                  return "DDERR_PALETTEBUSY"; 
    case DDERR_COLORKEYNOTSET:               return "DDERR_COLORKEYNOTSET"; 
    case DDERR_SURFACEALREADYATTACHED:       return "DDERR_SURFACEALREADYATTACHED"; 
    case DDERR_SURFACEALREADYDEPENDENT:      return "DDERR_SURFACEALREADYDEPENDENT"; 
    case DDERR_SURFACEBUSY:                  return "DDERR_SURFACEBUSY"; 
    case DDERR_CANTLOCKSURFACE:              return "DDERR_CANTLOCKSURFACE"; 
    case DDERR_SURFACEISOBSCURED:            return "DDERR_SURFACEISOBSCURED"; 
    case DDERR_SURFACELOST:                  return "DDERR_SURFACELOST"; 
    case DDERR_SURFACENOTATTACHED:           return "DDERR_SURFACENOTATTACHED"; 
    case DDERR_TOOBIGHEIGHT:                 return "DDERR_TOOBIGHEIGHT"; 
    case DDERR_TOOBIGSIZE:                   return "DDERR_TOOBIGSIZE"; 
    case DDERR_TOOBIGWIDTH:                  return "DDERR_TOOBIGWIDTH"; 
    case DDERR_UNSUPPORTED:                  return "DDERR_UNSUPPORTED"; 
    case DDERR_UNSUPPORTEDFORMAT:            return "DDERR_UNSUPPORTEDFORMAT"; 
    case DDERR_UNSUPPORTEDMASK:              return "DDERR_UNSUPPORTEDMASK"; 
    case DDERR_VERTICALBLANKINPROGRESS:      return "DDERR_VERTICALBLANKINPROGRESS"; 
    case DDERR_WASSTILLDRAWING:              return "DDERR_WASSTILLDRAWING"; 
    case DDERR_XALIGN:                       return "DDERR_XALIGN"; 
    case DDERR_INVALIDDIRECTDRAWGUID:        return "DDERR_INVALIDDIRECTDRAWGUID"; 
    case DDERR_DIRECTDRAWALREADYCREATED:     return "DDERR_DIRECTDRAWALREADYCREATED"; 
    case DDERR_NODIRECTDRAWHW:               return "DDERR_NODIRECTDRAWHW"; 
    case DDERR_PRIMARYSURFACEALREADYEXISTS:  return "DDERR_PRIMARYSURFACEALREADYEXISTS"; 
    case DDERR_NOEMULATION:                  return "DDERR_NOEMULATION"; 
    case DDERR_REGIONTOOSMALL:               return "DDERR_REGIONTOOSMALL"; 
    case DDERR_CLIPPERISUSINGHWND:           return "DDERR_CLIPPERISUSINGHWND"; 
    case DDERR_NOCLIPPERATTACHED:            return "DDERR_NOCLIPPERATTACHED"; 
    case DDERR_NOHWND:                       return "DDERR_NOHWND"; 
    case DDERR_HWNDSUBCLASSED:               return "DDERR_HWNDSUBCLASSED"; 
    case DDERR_HWNDALREADYSET:               return "DDERR_HWNDALREADYSET"; 
    case DDERR_NOPALETTEATTACHED:            return "DDERR_NOPALETTEATTACHED"; 
    case DDERR_NOPALETTEHW:                  return "DDERR_NOPALETTEHW"; 
    case DDERR_BLTFASTCANTCLIP:              return "DDERR_BLTFASTCANTCLIP"; 
    case DDERR_NOBLTHW:                      return "DDERR_NOBLTHW"; 
    case DDERR_NODDROPSHW:                   return "DDERR_NODDROPSHW"; 
    case DDERR_OVERLAYNOTVISIBLE:            return "DDERR_OVERLAYNOTVISIBLE"; 
    case DDERR_NOOVERLAYDEST:                return "DDERR_NOOVERLAYDEST"; 
    case DDERR_INVALIDPOSITION:              return "DDERR_INVALIDPOSITION"; 
    case DDERR_NOTAOVERLAYSURFACE:           return "DDERR_NOTAOVERLAYSURFACE"; 
    case DDERR_EXCLUSIVEMODEALREADYSET:      return "DDERR_EXCLUSIVEMODEALREADYSET"; 
    case DDERR_NOTFLIPPABLE:                 return "DDERR_NOTFLIPPABLE"; 
    case DDERR_CANTDUPLICATE:                return "DDERR_CANTDUPLICATE"; 
    case DDERR_NOTLOCKED:                    return "DDERR_NOTLOCKED"; 
    case DDERR_CANTCREATEDC:                 return "DDERR_CANTCREATEDC"; 
    case DDERR_NODC:                         return "DDERR_NODC"; 
    case DDERR_WRONGMODE:                    return "DDERR_WRONGMODE"; 
    case DDERR_IMPLICITLYCREATED:            return "DDERR_IMPLICITLYCREATED"; 
    case DDERR_NOTPALETTIZED:                return "DDERR_NOTPALETTIZED"; 
    case DDERR_UNSUPPORTEDMODE:              return "DDERR_UNSUPPORTEDMODE"; 
    case DDERR_NOMIPMAPHW:                   return "DDERR_NOMIPMAPHW"; 
    case DDERR_INVALIDSURFACETYPE:           return "DDERR_INVALIDSURFACETYPE"; 
    case DDERR_DCALREADYCREATED:             return "DDERR_DCALREADYCREATED"; 
    case DDERR_CANTPAGELOCK:                 return "DDERR_CANTPAGELOCK"; 
    case DDERR_CANTPAGEUNLOCK:               return "DDERR_CANTPAGEUNLOCK"; 
    case DDERR_NOTPAGELOCKED:                return "DDERR_NOTPAGELOCKED"; 
    case DDERR_NOTINITIALIZED:               return "DDERR_NOTINITIALIZED"; 
    } 

    return "Unknown Error"; 
} 


A function that we can use in our code to help us check for errors. It checks if an HRESULT is a 
failure, and if it is, it prints a debugging message and returns true, otherwise it returns false. 


bool DDFailedCheck(HRESULT hr, const char *szMessage) 
{ 
    if (FAILED(hr)) 
    { 
        char buf[1024]; 
        sprintf( buf, "%s (%s)\n", szMessage, DDErrorString(hr) ); 
        OutputDebugString( buf ); 
        return true; 
    } 
    return false; 
} 


Some lazy coders think that they can get away without doing much error checking. With 
DirectX, this is a very bad idea. You will have errors. 

Initializing the DirectDraw system 

After having created a Windows window (using MFC or plain Win32), we initialize the 
DirectDraw system, by creating an "IDirectDraw" object. 

The DirectDrawCreate or DirectDrawCreateEx function calls can be used to create a 
DirectDraw object. You only create a single DirectDraw object for your application. 


bool DDInit( HWND hWnd ) 
{ 
    HRESULT hr; 

    g_hWnd = hWnd; 

    // Initialize DirectDraw 
    hr = DirectDrawCreate( NULL, &g_pDD, NULL ); 
    if (DDFailedCheck(hr, "DirectDrawCreate failed")) 
        return false; 

    return true; 
} 


Note that DirectDrawCreate will create an "old" DirectDraw interface that does not support the 
functions that "new" DirectDraw interfaces (such as IDirectDraw7) do. Use 
DirectDrawCreateEx to create a DirectDraw interface that does. For our simple sample the 
above is sufficient. 

Setting the screen mode 

The remaining DirectDraw initialization (setting modes, creating surfaces and clippers) I place 
in a single function called DDCreateSurfaces. 

The function SetCooperativeLevel is used to tell the system whether or not we want to use 
full-screen mode or windowed mode. In full-screen mode, we have to get exclusive access to 
the DirectDraw device, and then set the display mode. For windowed mode, we set the 
cooperative level to normal. 


bool DDCreateSurfaces( bool bFullScreen ) 
{ 
    HRESULT hr; // Holds return values for DirectX function calls 

    g_bFullScreen = bFullScreen; 

    // If we want to be in full-screen mode 
    if (g_bFullScreen) 
    { 
        // Set the "cooperative level" so we can use full-screen mode 
        hr = g_pDD->SetCooperativeLevel(g_hWnd, 
            DDSCL_EXCLUSIVE | DDSCL_FULLSCREEN | DDSCL_NOWINDOWCHANGES); 
        if (DDFailedCheck(hr, "SetCooperativeLevel")) 
            return false; 

        // Set 640x480x256 full-screen mode 
        hr = g_pDD->SetDisplayMode(640, 480, 8); 
        if (DDFailedCheck(hr, "SetDisplayMode")) 
            return false; 
    } 
    else 
    { 
        // Set DDSCL_NORMAL to use windowed mode 
        hr = g_pDD->SetCooperativeLevel(g_hWnd, DDSCL_NORMAL); 
        if (DDFailedCheck(hr, "SetCooperativeLevel windowed")) 
            return false; 
    } 


Creating surfaces 

OK ... now that we've got that bit of initialization out of the way, we need to create a flipping 
structure. No, I'm not cursing the structure .. "flipping" as in screen page-flipping :). 

Anyway, we need to create one main surface that everyone will see, and a "back" surface. All 
drawing is done to the back surface. When we are finished drawing we need to make what 
we've drawn visible. In full-screen mode, we just need to call a routine called Flip, which will 
turn the current back surface into the primary surface and vice versa. In windowed mode, we 
don't actually flip the surfaces - we copy the contents of the back buffer onto the primary 
buffer, which is what's inside the window. In other words, we "blit" the back surface onto the 
primary surface. 

Anyway, here is the bit of code to create the surfaces. Right now the code is ignoring 
full-screen mode and only catering for windowed mode, but that'll change. Also, if there are errors 
in this code, consider them "exercises" ... :). 


    DDSURFACEDESC ddsd; // A structure to describe the surfaces we want 
    // Clear all members of the structure to 0 
    memset(&ddsd, 0, sizeof(ddsd)); 
    // The first parameter of the structure must contain the size of the structure 
    ddsd.dwSize = sizeof(ddsd); 

    if (g_bFullScreen) 
    { 
        // Screw the full-screen mode (for now) (FIXME) 
    } 
    else 
    { 
        // -- Create the primary surface 

        // The dwFlags parameter tells DirectDraw which DDSURFACEDESC 
        // fields will contain valid values 
        ddsd.dwFlags = DDSD_CAPS; 
        ddsd.ddsCaps.dwCaps = DDSCAPS_PRIMARYSURFACE; 

        hr = g_pDD->CreateSurface(&ddsd, &g_pDDS, NULL); 
        if (DDFailedCheck(hr, "Create primary surface")) 
            return false; 

        // -- Create the back buffer 

        ddsd.dwFlags = DDSD_WIDTH | DDSD_HEIGHT | DDSD_CAPS; 
        // Make our off-screen surface 320x240 
        ddsd.dwWidth = 320; 
        ddsd.dwHeight = 240; 
        // Create an offscreen surface 
        ddsd.ddsCaps.dwCaps = DDSCAPS_OFFSCREENPLAIN; 

        hr = g_pDD->CreateSurface(&ddsd, &g_pDDSBack, NULL); 
        if (DDFailedCheck(hr, "Create back surface")) 
            return false; 
    } 


Creating the Clipper 

Now that we've created the surfaces, we need to create a clipper (if we're running in 
windowed mode), and attach the clipper to the primary surface. This prevents DirectDraw 
from drawing outside the window's client area. 


    // -- Create a clipper for the primary surface in windowed mode 
    if (!g_bFullScreen) 
    { 
        // Create the clipper using the DirectDraw object 
        hr = g_pDD->CreateClipper(0, &g_pClipper, NULL); 
        if (DDFailedCheck(hr, "Create clipper")) 
            return false; 

        // Assign your window's HWND to the clipper 
        hr = g_pClipper->SetHWnd(0, g_hWnd); 
        if (DDFailedCheck(hr, "Assign hWnd to clipper")) 
            return false; 

        // Attach the clipper to the primary surface 
        hr = g_pDDS->SetClipper(g_pClipper); 
        if (DDFailedCheck(hr, "Set clipper")) 
            return false; 
    } 

    return true; 
} 


Putting it all together 

Now that we have all these initialization routines, we need to actually call them, so the 
question is, where to call them? 

In an MFC application, a logical place to do this is in the application's InitInstance routine: 


BOOL CYourAppNameHereApp::InitInstance()
{
    // ... All the other MFC initialization junk here ...

    // Initialize DirectDraw
    if (!DDInit(AfxGetMainWnd()->GetSafeHwnd()))
    {
        AfxMessageBox("Failed to initialize DirectDraw");
        return FALSE;
    }

    // Create DirectDraw surfaces
    if (!DDCreateSurfaces(false))
    {
        AfxMessageBox("Failed to create surfaces");
        return FALSE;
    }

    return TRUE;
}


In a plain Win32 application, you can do this in your WinMain function just before you enter 
the main message loop, but after you've created your window: 


int APIENTRY WinMain(HINSTANCE hInstance,
                     HINSTANCE hPrevInstance,
                     LPSTR     lpCmdLine,
                     int       nCmdShow)
{
    MSG Msg;

    g_hInstance = hInstance;

    if (!hPrevInstance) {
        if (!Register(g_hInstance))
            return FALSE;
    }

    // Create the main window
    g_hwndMain = Create(nCmdShow, 320, 240);
    if (!g_hwndMain)
        return FALSE;

    // Initialize DirectDraw
    if (!DDInit(g_hwndMain))
    {
        MessageBox(g_hwndMain, "Failed to initialize DirectDraw", "Error", MB_OK);
        return 0;
    }

    // Create DirectDraw surfaces
    if (!DDCreateSurfaces(false))
    {
        MessageBox(g_hwndMain, "Failed to create surfaces", "Error", MB_OK);
        return 0;
    }

    while (GetMessage(&Msg, NULL, 0, 0))
    {
        TranslateMessage(&Msg);
        DispatchMessage(&Msg);
    }

    return Msg.wParam;
}


Restoring lost surfaces 

As if all this initialization wasn't enough, we also have to make sure our DirectDraw surfaces 
are not getting "lost". The memory associated with DirectDraw surfaces can be released under 
certain circumstances, because it has to share resources with the Windows GDI. So each time 
we render, we first have to check if our surfaces have been lost and Restore them if they 
have. This is accomplished with the IsLost function. 


void CheckSurfaces()
{
    // Check the primary surface
    if (g_pDDS)
    {
        if (g_pDDS->IsLost() == DDERR_SURFACELOST)
            g_pDDS->Restore();
    }

    // Check the back buffer
    if (g_pDDSBack)
    {
        if (g_pDDSBack->IsLost() == DDERR_SURFACELOST)
            g_pDDSBack->Restore();
    }
}


The rendering loop 

Now that we've got most of the general initialization out of the way, we need to set up a 
rendering loop. This is basically the main loop of the game, the so-called HeartBeat function. 
So we're going to call it just that. 

The HeartBeat function gets called during your application's idle-time processing, which is
typically whenever the window has no more messages to process.

MFC: We can override the application's OnIdle function and call our HeartBeat function from
there. Use ClassWizard or the toolbar wizard to create a handler for "idle-time processing" for
your main application class.


BOOL CYourMFCAppNameHereApp::OnIdle(LONG lCount)
{
    CWinApp::OnIdle(lCount);  // Call the parent default OnIdle handler

    // Our game's heartbeat function
    HeartBeat();

    // Request more idle-time, so that we can render the next loop!
    return TRUE;
}


Win32: We can call the heartbeat function from inside the message loop, by using the 
function PeekMessage in our WinMain function to determine if we have any messages 
waiting: 


g_bRunning = true;
while (g_bRunning)
{
    while (PeekMessage(&Msg, g_hwndMain, 0, 0, PM_NOREMOVE))
    {
        BOOL bGetResult = GetMessage(&Msg, NULL, 0, 0);
        TranslateMessage(&Msg);
        DispatchMessage(&Msg);
        if (bGetResult == 0)
            g_bRunning = false;
    }

    if (g_bRunning)
    {
        CheckSurfaces();
        HeartBeat();
    }
}


There are other ways to decide when to call the HeartBeat function; for example, you could
use a timer. The method you use depends on the type of game you are making. If you are
making a first-person 3D shooter, you probably want as high a frame rate as possible, so you
might use the idle-time method. If you are making a 2D scrolling game, this might not be
optimal, as you may want to control the frame rate.
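
As a rough illustration of the timer approach, the sketch below only lets a frame through once a fixed interval has passed. The FrameLimiter class and its shouldTick method are hypothetical names invented for this example; they are not part of the sample or of DirectX. The current time is passed in as a plain millisecond count, so the logic can be tried without a window or a real clock.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of the sample): decides whether enough
// time has passed to run the next HeartBeat.
class FrameLimiter
{
public:
    // intervalMs: desired minimum time between frames, in milliseconds
    explicit FrameLimiter(std::uint32_t intervalMs)
        : m_intervalMs(intervalMs), m_lastTickMs(0), m_started(false)
    {
        assert(intervalMs > 0);
    }

    // nowMs: current time in milliseconds from any monotonic source.
    // Returns true when the caller should render a frame now.
    bool shouldTick(std::uint32_t nowMs)
    {
        // Always render the very first frame
        if (!m_started)
        {
            m_started = true;
            m_lastTickMs = nowMs;
            return true;
        }

        // Unsigned subtraction stays correct even if the counter wraps
        if (nowMs - m_lastTickMs >= m_intervalMs)
        {
            m_lastTickMs = nowMs;
            return true;
        }
        return false;
    }

private:
    std::uint32_t m_intervalMs;
    std::uint32_t m_lastTickMs;
    bool m_started;
};
```

In the Win32 loop, something like `if (limiter.shouldTick(GetTickCount())) HeartBeat();` with a 50 ms interval would cap rendering at roughly 20 frames per second, assuming the loop itself is polled more often than that.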

The HeartBeat function 

Now let's look at the heartbeat function. The function checks for lost surfaces, then clears the 
back buffer with black, then draws a color square to the back buffer, and then flips the back 
buffer to the front. 


void HeartBeat()
{
    // Check for lost surfaces
    CheckSurfaces();

    // Clear the back buffer
    DDClear(g_pDDSBack, 0, 0, 320, 240);

    static int iFoo = 0;

    // Draw a weird looking color square
    for (int r = 0; r < 64; r++)
    {
        for (int g = 0; g < 64; g++)
        {
            DDPutPixel(g_pDDSBack, g, r, (r*2 + iFoo) % 256,
                       (g + iFoo) % 256, (63 - g) * 4);
        }
    }
    iFoo++;

    // Blit the back buffer to the front buffer
    DDFlip();
}


The DDPutPixel function used here was explained earlier.
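
To see which colors the nested loop produces, here is a small sketch that repeats only the arithmetic from the DDPutPixel call, with no DirectDraw involved. It assumes that the last three arguments of DDPutPixel are the red, green, and blue components; the Rgb struct and the squareColor function are hypothetical names made up for this illustration.

```cpp
#include <cassert>

// Hypothetical plain-data color, used only for this illustration
struct Rgb
{
    int red;
    int green;
    int blue;
};

// Same arithmetic as the DDPutPixel call in HeartBeat:
// r is the row (0..63), g is the column (0..63), and iFoo is the
// frame counter that shifts the red and green channels every frame.
Rgb squareColor(int r, int g, int iFoo)
{
    assert(r >= 0 && r < 64);
    assert(g >= 0 && g < 64);
    assert(iFoo >= 0);

    Rgb c;
    c.red   = (r * 2 + iFoo) % 256;  // grows down the rows, wraps at 256
    c.green = (g + iFoo) % 256;      // grows across the columns, wraps at 256
    c.blue  = (63 - g) * 4;          // fades from 252 at the left edge to 0
    return c;
}
```

Because iFoo is added before the modulo, the red and green gradients appear to scroll from frame to frame, while the blue gradient stays fixed.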

Flipping surfaces 

Now let's look at the function that performs the surface flipping. 


void DDFlip()
{
    HRESULT hr;

    // if we're windowed do the blit, else just Flip
    if (!g_bFullScreen)
    {
        RECT rcSrc;   // source blit rectangle
        RECT rcDest;  // destination blit rectangle
        POINT p;

        // find out where on the primary surface our window lives
        p.x = 0; p.y = 0;
        ::ClientToScreen(g_hWnd, &p);
        ::GetClientRect(g_hWnd, &rcDest);
        OffsetRect(&rcDest, p.x, p.y);
        SetRect(&rcSrc, 0, 0, 320, 240);
        hr = g_pDDS->Blt(&rcDest, g_pDDSBack, &rcSrc, DDBLT_WAIT, NULL);
    }
    else
    {
        hr = g_pDDS->Flip(NULL, DDFLIP_WAIT);
    }
}


A primary surface in windowed mode represents the entire Windows screen, so we have to
first find out where on the screen our window is, and then translate by that offset in order to
blit into the window.
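
The offset calculation can be tried on its own with plain structures. The sketch below is only an illustration: Rect, Point, and clientRectOnScreen are hypothetical stand-ins for the Win32 RECT and POINT types and for the GetClientRect plus OffsetRect steps, so that it compiles without the Windows headers.

```cpp
#include <cassert>

// Hypothetical stand-ins for the Win32 RECT and POINT structures
struct Rect  { long left, top, right, bottom; };
struct Point { long x, y; };

// Mirrors GetClientRect + OffsetRect from DDFlip: a client area of the
// given size, moved to where the window's client origin sits on screen.
Rect clientRectOnScreen(Point clientOrigin, long width, long height)
{
    assert(width >= 0 && height >= 0);

    Rect rc;
    rc.left   = clientOrigin.x;
    rc.top    = clientOrigin.y;
    rc.right  = clientOrigin.x + width;
    rc.bottom = clientOrigin.y + height;
    return rc;
}
```

For a 320x240 client area whose top-left corner sits at screen position (100, 50), the destination rectangle on the primary surface runs from (100, 50) to (420, 290).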

Note the Blt parameter DDBLT_WAIT. By default, if a surface is "busy" when you call Blt (for
example if the GDI is accessing it) then DirectDraw will return an error, without performing
the blit. Passing the DDBLT_WAIT option will instruct DirectDraw to wait until the surface
becomes available and then perform the blit.

Cleaning up 

When we're done with DirectX objects, we have to "release" them, which is done by calling 
Release on them, for example: 


void DDDone()
{
    if (g_pDD != NULL)
    {
        g_pDD->Release();
        g_pDD = NULL;
    }
}
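
Since the same check, release, and reset pattern applies to every DirectX object, it can be wrapped in a small helper. The sketch below is one possible way to do that, not something the sample itself uses: SafeRelease is a hypothetical helper name, and MockInterface is a made-up stand-in for a real DirectX interface so that the example compiles without the DirectX headers.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: releases a COM-style object if the pointer is set,
// then clears the pointer so it cannot be released twice.
template <typename T>
void SafeRelease(T*& p)
{
    if (p != NULL)
    {
        p->Release();
        p = NULL;
    }
}

// Mock object standing in for a DirectX interface in this sketch;
// it only counts how many times Release was called.
struct MockInterface
{
    int releaseCount;
    MockInterface() : releaseCount(0) {}
    void Release() { ++releaseCount; }
};
```

Calling SafeRelease a second time on the same pointer does nothing, because the first call already set it to NULL.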


Sample TODO 

There are a few things the sample can't do yet. For one thing, full-screen mode doesn't work
properly yet. It should also demonstrate how to handle switching between windowed and
full-screen modes.

CH 4: A simple Direct3D Retained mode sample 


Direct3D: An Overview 


Over here I'll shove in some basics, like coordinate systems, world and object coordinate 
systems, etc. For now I'll assume you're at least a little familiar with 3D programming. Blah 
blah blah, differences between immediate and retained mode, etc etc. 


Devices 

Direct3D interfaces with the surface it is rendering to (e.g. screen memory, system memory)
using an IDirect3DRMDevice object. More than one type of rendering device can exist, and a
specific rendering device must be chosen for a scene. For example, there is normally a device
for RGB rendering and a device for Mono rendering. These names refer to the lighting model
used for rendering: Mono means that only white lights can exist in the scene, while RGB
supports colored lights and is thus slower. Additional devices may be installed that make use
of 3D hardware acceleration. It is possible to iterate through the installed D3D devices by
enumerating them (EnumDevices). It is possible to have two different devices
rendering to the same surface.

Viewports 

The IDirect3DRMViewport object is used to keep track of how our 3D scene is rendered 
onto the device. It is possible to have multiple viewports per device, and it is also possible to 
have a viewport rendering to more than one device. The viewport object keeps track of the 
camera, front and back clipping fields, field of view etc. 


Frames 

A frame in Direct3D is basically used to store an object's position and orientation information, 
relative to a given frame of reference, which is where the term frame comes from. Frames are 
positioned relative to other frames, or to the world coordinates. Frames are used to store the 
positions of objects in the scene as well as other things like lights. OK, so I'm explaining it 
badly. It's late, I'm tired, I'll revise it soon. To add an object to the scene we have to attach 
the object to a frame. The object is called a visual in Direct3D, since it represents what the 
user sees. So, a visual has no meaningful position or orientation information itself, but when 
attached to a frame, it is transformed when rendered according to the transformation 
information in the frame. Multiple frames may use the same visual. This can save a lot of time 
and memory in a situation like, for example, a forest or a small fleet of spacecraft, where you 
have a bunch of objects that look exactly the same but all exist in different positions and 
orientations. 

Here is a crummy ASCII diagram of a single visual attached to two frames which are at 
different positions: 



If both of these frames were attached to the scene frame, then our scene would have 2 cubes 
in it; one at (21, 3, 4) and the other at (-12, 10, -6). 
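
The idea of one visual shared by several frames can be sketched in a few lines of plain C++. This is a heavily simplified model, not the Direct3D RM API: Vec3, Visual, Frame, and worldPosition are hypothetical names, and rotation is left out so that a frame only carries an offset relative to its parent.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, heavily simplified stand-ins for Direct3D RM concepts.
struct Vec3 { double x, y, z; };

// A visual holds shape data only; it has no position of its own.
struct Visual { const char* name; };

// A frame positions whatever is attached to it, relative to its parent.
// Rotation is left out to keep the sketch short.
struct Frame
{
    const Frame*  parent;   // NULL for the scene (root) frame
    Vec3          offset;   // position relative to the parent frame
    const Visual* visual;   // may be shared by several frames
};

// Walks up the parent chain and sums the offsets to get the
// position in world (scene) coordinates.
Vec3 worldPosition(const Frame& frame)
{
    Vec3 pos = {0.0, 0.0, 0.0};
    for (const Frame* f = &frame; f != NULL; f = f->parent)
    {
        pos.x += f->offset.x;
        pos.y += f->offset.y;
        pos.z += f->offset.z;
    }
    return pos;
}
```

Two frames with offsets (21, 3, 4) and (-12, 10, -6), both children of the scene frame and both pointing at the same cube visual, give two cubes in the scene while the cube's data is stored only once.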


The Direct3D RM Sample 


First, here's a screenshot of the small, simple sample application we're putting together here.


Setting up global variables 

Before we start we'll need a few global variables. 


LPDIRECTDRAW         pDD;           // A DirectDraw object
LPDIRECT3DRM         pD3DRM;        // A Direct3D RM object
LPDIRECTDRAWSURFACE  pDDSPrimary;   // DirectDraw primary surface
LPDIRECTDRAWSURFACE  pDDSBack;      // DirectDraw back surface
LPDIRECTDRAWPALETTE  pDDPal;        // Palette for primary surface
LPDIRECTDRAWCLIPPER  pClipper;      // Clipper for windowed mode
LPDIRECT3DRMDEVICE   pD3DRMDevice;  // A device
LPDIRECT3DRMVIEWPORT pViewport;     // A viewport
LPDIRECT3DRMFRAME    pCamera;       // A camera
LPDIRECT3DRMFRAME    pScene;        // The scene
LPDIRECT3DRMFRAME    pCube;         // The one and only object in
                                    // our scene
BOOL bFullScreen;                   // Are we in full-screen mode?
BOOL bAnimating;                    // Has our animating begun?
HWND ddWnd;                         // HWND of the DDraw window


Note that we need both a DirectDraw object and a Direct3D object to create a Direct3D 
application. This is because Direct3D works in conjunction with DirectDraw. As before, we 
need a primary and a back surface for our double-buffering, and a clipper to handle
window-clipping in windowed mode. The palette object is not discussed in this tutorial (yet). We
have objects for the device and viewport, and we have frame objects to keep track of the 
scene and the scene's camera. Also, we have a frame that is used for the object we'll have in 
this scene. 

Here is a routine that simply resets these globals to their initial values:


void InitDirectXGlobals()
{
    pDD = NULL;
    pD3DRM = NULL;
    pDDSPrimary = NULL;
    pDDSBack = NULL;
    pDDPal = NULL;
    pClipper = NULL;
    pD3DRMDevice = NULL;
    pViewport = NULL;
    pCamera = NULL;
    pScene = NULL;
    pCube = NULL;

    bFullScreen = FALSE;
    bAnimating = FALSE;
}


From 'Initializing the DirectDraw system' to 'Creating the clipper' 

These steps all proceed exactly as in the DirectDraw sample, with the exception of the
CreateSurface function, where the back surface has to be created with the DDSCAPS_3DDEVICE
flag, since it will be used for 3D rendering:


UINT CreatePrimarySurface()
{
    // ... (as in the DirectDraw sample) ...

    // Create an offscreen surface, specifying 3d device
    ddsd.ddsCaps.dwCaps = DDSCAPS_OFFSCREENPLAIN | DDSCAPS_3DDEVICE;

    // ... (as in the DirectDraw sample) ...
}


Creating the Direct3D Retained Mode object 

Now we need to create an IDirect3DRM object. This is achieved, quite simply, by calling the
Direct3DRMCreate function.


UINT CreateDirect3DRM()
{
    HRESULT hr;

    // Create the IDirect3DRM object
    hr = Direct3DRMCreate(&pD3DRM);
    if (FAILED(hr)) {
        TRACE("Error creating Direct3d RM object\n");
        return 1;
    }

    return 0;
}


Creating the device for rendering 


We create the device object from the back surface, since this surface is the one we will render 
to. 


UINT CreateDevice()
{
    HRESULT hr;

    hr = pD3DRM->CreateDeviceFromSurface(
        NULL, pDD, pDDSBack, &pD3DRMDevice);
    if (FAILED(hr)) {
        TRACE("Error %d creating d3drm device\n", int(LOWORD(hr)));
        return 1;
    }

    // success
    return 0;
}


Creating the viewport 

We do a bit more than just create the viewport here. We create the scene object and the 
camera object, as well as set the ambient light for the scene, and create a directional light. 


UINT CreateViewport()
{
    HRESULT hr;

    // First create the scene frame
    hr = pD3DRM->CreateFrame(NULL, &pScene);
    if (FAILED(hr)) {
        TRACE("Error creating the scene frame\n");
        return 1;
    }

    // Next, create the camera as a child of the scene
    hr = pD3DRM->CreateFrame(pScene, &pCamera);
    if (FAILED(hr)) {
        TRACE("Error creating the camera frame\n");
        return 2;
    }

    // Set the camera to lie somewhere on the negative z-axis, and
    // point towards the origin
    pCamera->SetPosition(
        pScene, D3DVAL(0.0), D3DVAL(0.0), D3DVAL(-300.0));
    pCamera->SetOrientation(
        pScene,
        D3DVAL(0.0), D3DVAL(0.0), D3DVAL(1.0),
        D3DVAL(0.0), D3DVAL(1.0), D3DVAL(0.0));

    // create lights
    LPDIRECT3DRMLIGHT pLightAmbient = NULL;
    LPDIRECT3DRMLIGHT pLightDirectional = NULL;
    LPDIRECT3DRMFRAME pLights = NULL;

    // Create two lights and a frame to attach them to
    // I haven't quite figured out the CreateLight's second
    // parameter yet.
    pD3DRM->CreateFrame(pScene, &pLights);
    pD3DRM->CreateLight(D3DRMLIGHT_AMBIENT, D3DRMCreateColorRGB(
        D3DVALUE(0.3), D3DVALUE(0.3), D3DVALUE(0.3)),
        &pLightAmbient);
    pD3DRM->CreateLight(D3DRMLIGHT_DIRECTIONAL, D3DRMCreateColorRGB(
        D3DVALUE(0.8), D3DVALUE(0.8), D3DVALUE(0.8)),
        &pLightDirectional);

    // Orient the directional light
    pLights->SetOrientation(pScene,
        D3DVALUE(30.0), D3DVALUE(-20.0), D3DVALUE(50.0),
        D3DVALUE(0.0), D3DVALUE(1.0), D3DVALUE(0.0));

    // Add ambient light to the scene, and the directional light
    // to the pLights frame
    pScene->AddLight(pLightAmbient);
    pLights->AddLight(pLightDirectional);

    // Create the viewport on the device
    hr = pD3DRM->CreateViewport(pD3DRMDevice,
        pCamera, 10, 10, 300, 220, &pViewport);
    if (FAILED(hr)) {
        TRACE("Error creating viewport\n");
        return 3;
    }

    // set the back clipping field
    hr = pViewport->SetBack(D3DVAL(5000.0));

    // Release the temporary lights created. It seems
    // they will have been copied for the scene during AddLight
    pLightAmbient->Release();
    pLightDirectional->Release();

    // success
    return 0;
}


Putting it all together 

Here is the tail-end of the app's InitInstance function:


    InitDirectXGlobals();

    TRACE("Calling InitDDraw\n");
    InitDDraw();
    SetMode();

    // TRACE("Calling LoadJascPalette\n");
    // LoadJascPalette("inspect.pal", 10, 240);

    TRACE("Calling CreatePrimarySurface\n");
    CreatePrimarySurface();

    TRACE("Calling CreateClipper\n");
    CreateClipper();

    // TRACE("Calling AttachPalette\n");
    // AttachPalette(pDDPal);

    TRACE("Calling CreateDirect3DRM\n");
    CreateDirect3DRM();

    TRACE("Calling CreateDevice\n");
    CreateDevice();

    TRACE("Calling CreateViewport\n");
    CreateViewport();

    TRACE("Calling CreateDefaultScene\n");
    CreateDefaultScene();

    bAnimating = TRUE;

    return TRUE;
}


Restoring lost surfaces 

Same as the DirectDraw sample: 


BOOL CheckSurfaces()
{
    // Check the primary surface
    if (pDDSPrimary) {
        if (pDDSPrimary->IsLost() == DDERR_SURFACELOST) {
            pDDSPrimary->Restore();
            return FALSE;
        }
    }

    return TRUE;
}


The Rendering loop 

Same as the DirectDraw sample: 


BOOL CD3dRmAppApp::OnIdle(LONG lCount)
{
    CWinApp::OnIdle(lCount);
    if (bAnimating) {
        HeartBeat();
        Sleep(50);
    }

    return TRUE;
}


The HeartBeat function 


BOOL CD3dRmAppApp::HeartBeat()
{
    HRESULT hr;

    // if (!CheckSurfaces()) bForceUpdate = TRUE;
    // if (bForceUpdate) pViewport->ForceUpdate(10, 10, 300, 220);
    hr = pD3DRM->Tick(D3DVALUE(1.0));
    if (FAILED(hr)) {
        TRACE("Tick error!\n");
        return FALSE;
    }

    // Call our routine for flipping the surfaces
    FlipSurfaces();

    // No major errors
    return TRUE;



